DESCRIPTION
LIPOPROTEINS AS NUCLEIC ACID VECTORS 

BACKGROUND OF THE INVENTION 

The present application is a continuation-in-part of co-pending U.S. Patent Application 
Serial No. 08/874,807, entitled "Lipoproteins As Nucleic Acid Vectors," filed June 13, 1997.
The entire text of the above-referenced disclosure is specifically incorporated by reference 
herein without disclaimer. 

1. Field of the Invention

The present invention relates to materials and methods for the in vivo transport and 
delivery of nucleic acids. More particularly, it concerns the use of lipoproteins, including but 
not limited to, low density lipoproteins ("LDL"), and/or apolipoproteins for the in vivo transport
of nucleic acids. In addition, the present invention relates to the use of lipoproteins in the early 
detection of cancer and/or metastatic cancer and/or arteriosclerosis. 

2. Description of Related Art 

The ultimate curative method for any genetic disorder, whether the disorder is inherited
or results from a mutation, depends on an effective mode of replacing or augmenting
non-functional gene(s). This process is now termed gene or genetic therapy. There are two
important aspects to genetic therapy: the gene delivery system/vehicle and the gene
control/expression program. Ideally, a replacement gene should become resident in the genome
of the target cells/organism and be transferable to subsequent generations of cells and progeny,
i.e., the change is incorporated into the germ cells or reproductive cells, the sperm and ovary.
Although there have been several significant breakthroughs in this field, this area of
biotechnology is still in its early development phase. The first step in any approach to gene
replacement is the delivery of the specific gene (nucleic acid) to the cells.

Many techniques have been and are being developed to deliver and express genes in
cells and specific tissues in mammals in vivo. Several general, non-specific methods for
delivering genes have been reported involving aerosol nucleic acid delivery to cells (Stribling et
al., 1992); calcium phosphate precipitation, using a steep change in ionic strength (Wigler et al.,
1979); DEAE-dextran (Sompayrac et al., 1981); electroporation, forcing the nucleic acid into
the cell by using an electric field or current (Neumann et al., 1982); microinjection, physically
injecting the nucleic acid into a cell (Benvenisty et al., 1986; Wolff et al., 1990); and
polycationic molecules such as polylysine polypeptides (Curiel et al., 1992) and cationic lipids
(Lee et al., 1996).

Liposomes, vesicles composed of synthetic or non-natural lipids such as long-chain fatty
acids, can be used to carry the nucleic acid into the cell cytoplasm non-specifically (Felgner et
al., 1987). A recent invention describes the delivery of a self-initiating and self-sustaining gene
expression system which contains an RNA polymerase prebound to a DNA molecule using the
aforementioned nucleotide delivery systems (U.S. Patent No. 5,591,601).

Viral vectors in which specific nucleic acid sequences are incorporated into a neutralized
or inactivated virus can use their viral entry mechanism to gain entry to the cell cytoplasm via
specific cellular receptors to deliver nucleic acids (Shimotohno et al., 1981). The use of
specific cellular receptors is apparently a more specific method for delivering genes. In this
approach, the nucleic acid is bound either freely, through charge association, or alternatively it
is chemically and non-reversibly conjugated to proteins with specific receptor proteins on the
membrane of target cells for receptor-mediated uptake (Wu et al., 1988; Wu et al., 1989).

Techniques such as calcium phosphate precipitation, electroporation or DEAE-dextran 
transfection are not suitable for in vivo applications. Bombarding cells with nucleic acids under
high pressure is a technique which has very limited applications in that it can only be applied 
topically and only a small number of cells can be targeted. Microinjection of nucleic acids into 
cells is mainly performed in vitro and requires actively dividing cells. 

Gene delivery systems that use the viral entry mechanism of recombinant viral vectors
have major disadvantages. Systems that utilize replication-defective adenoviral vectors can
infect a wide variety of eukaryotic cell types including quiescent somatic cells utilizing the viral
entry mechanism. However, adenoviral vector-based delivery systems are only successful in
transient gene expression and repeated administration of the viral vector results in a strong
immunological response of the host. In addition, the host will experience an adenoviral
infection and can experience its symptoms if the recombinant vector undergoes homologous
recombination with the wild-type virus strain. Systems that employ recombinant retroviral
vectors can be used for stable integration of the gene of interest into the host's genome, but only
actively dividing cells can be targeted. In addition, the disadvantages of the adenoviral vector
systems also apply to retroviral vector systems (immune response, disease, etc.).

Positively-charged polycationic molecules such as polylysine peptides which bind
non-specifically to the negatively charged nucleic acids have been used to introduce DNA into the
chromosome of the recipient cell or organism. Cationic lipid vesicles, liposomes and micelles
have been used in aggregates with DNA and viral envelope glycoproteins in non-specific
delivery of genes. Liposomes, vesicles composed of synthetic or non-natural lipids, such as
long-chain fatty acids, can be used to carry the nucleic acid into the cell cytoplasm
non-specifically. In these systems, the liposomes are structured to "best fit" the nucleic acid and
insertion into the cell is through non-specific uptake.

The interaction of the liposomal delivery systems discussed above with the nucleic acid 
to be delivered is non-specific. In addition, prior art techniques are designed to deliver multiple 
copies of the nucleic acid to the cell cytoplasm. Optimally, however, only one or two copies of 
a gene should be transfected per cell throughout the organism to replace a defective set of genes
only in the specific cells and tissues where it would normally be expressed. 

Thus there is a need for a safe and efficient gene delivery system that may be employed
in the burgeoning field of gene therapy.

SUMMARY OF THE PRESENT INVENTION

The present invention contemplates a gene delivery system for use in gene therapy.
Thus in particular embodiments, the present invention provides a composition comprising an
isolated polypeptide comprising at least one LDL or VLDL nucleic acid binding domain; and a
nucleic acid comprising an LDL or VLDL binding sequence, wherein the nucleic acid is bound
to the polypeptide. In particularly preferred embodiments, the polypeptide comprises an LDL
nucleic acid binding domain. In other embodiments, the polypeptide comprises a VLDL
nucleic acid binding domain. In particular aspects of the present invention, the nucleic acid
comprises an expression region operably linked to a promoter active in eukaryotic cells. In
more particular embodiments, the expression region encodes a polypeptide. In other preferred
embodiments, the expression region comprises an antisense construct.

In those embodiments in which the expression region encodes a polypeptide, the
polypeptide may be selected from the group consisting of α-globin, β-globin, γ-globin,
granulocyte macrophage-colony stimulating factor (GM-CSF), tumor necrosis factor (TNF),
IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-13, IL-14, IL-15,
β-interferon, γ-interferon, cytosine deaminase, adenosine deaminase, β-glucuronidase,
hypoxanthine guanine phosphoribosyl transferase, galactose-1-phosphate uridyltransferase,
glucocerebrosidase, glucose-6-phosphatase, thymidine kinase, lysosomal glucosidase, growth
hormone, nerve growth factor, insulin, adrenocorticotropic hormone, parathormone,
follicle-stimulating hormone, luteinizing hormone, epidermal growth factor, thyroid stimulating
hormone, CFTR, EGFR, VEGFR, IL-2 receptor, estrogen receptor, Bax, Bak, Bcl-Xs, Bik,
Bid, Bad, Harakiri, Ad E1B, an ICE-CED3 protease, neomycin resistance, luciferase, adenine
phosphoribosyl transferase (APRT), retinoblastoma, insulin, mast cell growth factor, p53, p16,
p21, MMAC1, p73, zac1 and BRCA1.

In those embodiments in which the expression region comprises an antisense construct,
the antisense construct may be complementary to a segment of an oncogene. In more preferred
embodiments, the oncogene may be selected from the group consisting of ras, myc, neu, raf,
erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl.

The expression region may be linked to a promoter selected from the group consisting of
CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and human globin γ
promoter. In a defined embodiment, the nucleic acid binding domain is an apoB100 nucleic
acid binding domain. In other embodiments, the composition of the present invention may
further comprise one or more lipoproteins selected from the group consisting of apoAI,
apoA-II, apoA-IV, acat, apoE, apoC-II, apoC-III and apo-D. In a particularly preferred embodiment,
the apoB100 is selected from the group consisting of human, rat and baboon apoB100.

In particular aspects of the invention, the polypeptide comprises at least two nucleic acid
binding domains. In particularly preferred embodiments, the nucleic acid binding domain
contains a motif selected from the group consisting of a proline pipe helix DNA binding motif,
an ISGF3γ-like DNA binding motif, a SREBP-like DNA binding motif, a coiled-coil motif and a
nucleotide (ATP)-binding motif. In more defined embodiments, the binding domain may be
selected from the group consisting of SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID
NO:82, SEQ ID NO:83, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ
ID NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID NO:94,
SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:99, SEQ ID
NO:100, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:103, SEQ ID NO:105, SEQ ID
NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID NO:109, SEQ ID NO:110, SEQ ID
NO:111, SEQ ID NO:112, SEQ ID NO:113, SEQ ID NO:114, SEQ ID NO:115, SEQ ID NO:144, SEQ
ID NO:145, SEQ ID NO:146, SEQ ID NO:147, SEQ ID NO:148, SEQ ID NO:149, SEQ ID
NO:150, SEQ ID NO:151, SEQ ID NO:152, SEQ ID NO:153, SEQ ID NO:154, SEQ ID
NO:163, SEQ ID NO:164, SEQ ID NO:165, SEQ ID NO:166 and SEQ ID NO:175.

In other embodiments, the polypeptide may further comprise at least one nuclear
localization sequence. More particularly, the nuclear localization sequence may be from
apoB100. In more preferred embodiments, the nuclear localization sequence may be selected
from the group consisting of SEQ ID NO:178, SEQ ID NO:179, SEQ ID NO:180, SEQ ID
NO:194, SEQ ID NO:195, SEQ ID NO:196, SEQ ID NO:197, SEQ ID NO:198, SEQ ID
NO:199, SEQ ID NO:200, SEQ ID NO:201, SEQ ID NO:202, SEQ ID NO:203, SEQ ID
NO:204, SEQ ID NO:205, SEQ ID NO:206, SEQ ID NO:207, SEQ ID NO:208, SEQ ID
NO:209 and SEQ ID NO:210.

Also contemplated by the present invention is a method for expressing a polypeptide in 
a human cell comprising the steps of providing a composition comprising (i) an isolated 
polypeptide comprising at least one LDL or VLDL nucleic acid binding domain and (ii) a 
nucleic acid comprising an expression cassette comprising a sequence encoding the polypeptide 
and a promoter active in eukaryotic cells, wherein the coding sequence is operably linked to the 
promoter, and wherein the nucleic acid sequence is bound to the LDL or VLDL; contacting the 
composition with the cell under conditions permitting transfer of the composition into the cell; 
and culturing the cell under conditions permitting the expression of the polypeptide. 

In particularly preferred embodiments, the polypeptide, independently, is a tumor
suppressor, a cytokine, an enzyme, a hormone, a receptor, or an inducer of apoptosis. In
preferred embodiments, the tumor suppressor may be selected from the group consisting of p53,
p16, p21, MMAC1, p73, zac1, BRCA1 and Rb. In preferred embodiments, the cytokine may be
selected from the group consisting of IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10,
IL-11, IL-12, IL-13, IL-14, IL-15, TNF, GM-CSF, β-interferon and γ-interferon. In other
preferred embodiments, the enzyme may be selected from the group consisting of cytosine
deaminase, adenosine deaminase, β-glucuronidase, hypoxanthine guanine phosphoribosyl
transferase, galactose-1-phosphate uridyltransferase, glucocerebrosidase, glucose-6-phosphatase,
thymidine kinase and lysosomal glucosidase. In still further preferred embodiments, the
hormone may be selected from the group consisting of growth hormone, nerve growth factor,
insulin, adrenocorticotropic hormone, parathormone, follicle-stimulating hormone, luteinizing
hormone, epidermal growth factor and thyroid stimulating hormone. In defined embodiments,
the receptor may be selected from the group consisting of CFTR, EGFR, VEGFR, IL-2 receptor
and the estrogen receptor. In other preferred embodiments, the inducer of apoptosis may be
selected from the group consisting of Bax, Bak, Bcl-Xs, Bik, Bid, Bad, Harakiri, Ad E1B and an
ICE-CED3 protease.

In particularly preferred embodiments, the nucleic acid binding domain is an apoB100
nucleic acid binding domain. In more preferred embodiments, the apoB100 may be selected
from the group consisting of human, rat and baboon low density apoB100. In still further
preferred embodiments, the binding region is selected from the group consisting of a proline
pipe helix DNA binding motif, an ISGF3γ-like DNA binding motif, a SREBP-like DNA binding
motif, a coiled-coil motif, and a nucleotide (ATP)-binding motif. In particular embodiments,
the polypeptide further may comprise at least one nuclear localization sequence. In especially
preferred embodiments, the nuclear localization sequence is derived from an apoB100 nuclear
localization sequence. In specific embodiments, the polypeptide may be selected from the
group consisting of α-globin, β-globin, γ-globin, neomycin resistance, luciferase, adenine
phosphoribosyl transferase (APRT), and mast cell growth factor.

Also provided is a method for providing an expression construct to a human cell
comprising providing a composition comprising (i) an isolated polypeptide comprising at least
one LDL or VLDL nucleic acid binding domain and (ii) an expression cassette comprising a
nucleic acid sequence encoding an expression region and a promoter active in eukaryotic cells,
wherein the expression region is operably linked to the promoter, and wherein the nucleic acid
sequence is bound to the LDL or VLDL; contacting the composition with the cell under
conditions permitting transfer of the composition into the cell; and culturing the cell under
conditions permitting the expression of the expression region.

In particularly preferred embodiments, the expression construct comprises an antisense
construct. In more preferred embodiments, the antisense construct is derived from an oncogene.
In exemplary embodiments, the oncogene may be selected from the group consisting of ras, myc,
neu, raf, erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl. In other embodiments, the expression
construct comprises a nucleic acid coding for a gene. In preferred aspects the gene encodes a
polypeptide.

In particularly preferred embodiments, the nucleic acid binding domain is an apoB100
nucleic acid binding domain. The apoB100 may be selected from the group consisting of
human, rat and baboon low density apoB100. In other preferred embodiments, the DNA
binding region is selected from the group consisting of a proline pipe helix DNA binding motif,
an ISGF3γ-like DNA binding motif, a SREBP-like DNA binding motif, a coiled-coil motif, and
a nucleotide (ATP)-binding motif.

Further, the present invention contemplates a method for treating a human disease
comprising providing a composition comprising (i) an isolated polypeptide comprising at least 
one LDL or VLDL nucleic acid binding domain and (ii) an expression cassette comprising a 
nucleic acid sequence encoding an expression region and a promoter active in eukaryotic cells, 
wherein the expression region is operably linked to the promoter, and wherein the nucleic acid 
sequence is bound to the LDL or VLDL; and administering the composition to a human subject 
having the disease under conditions permitting transfer of the composition into cells of the 
human subject. 

In specific embodiments, the disease may be selected from the group consisting of 
cancer, diabetes, cystic fibrosis and arteriosclerosis. In preferred embodiments the polypeptide 
comprises at least two nucleic acid binding regions. In other preferred embodiments the 
polypeptide comprises at least one nuclear localization sequence. In particularly preferred
embodiments, the nucleic acid encodes a gene. In other preferred embodiments, the expression 
construct comprises an antisense construct. 

Another aspect of the present invention describes a pharmaceutical composition
comprising an isolated polypeptide comprising at least one LDL or VLDL nucleic acid binding 
domain; and a nucleic acid comprising an LDL or VLDL binding sequence, wherein the nucleic 
acid is bound to the polypeptide; the pharmaceutical composition being dispersed in a suitable 
diluent. 

Also contemplated by the present invention is a method of transforming a cell 
comprising providing a cell; contacting the cell with a composition comprising (i) an isolated 
polypeptide comprising at least one LDL or VLDL nucleic acid binding domain and (ii) an 
expression cassette comprising a nucleic acid sequence encoding an expression region and a 
promoter active in eukaryotic cells, wherein the expression region is operably linked to the 
promoter, and wherein the nucleic acid sequence is bound to the LDL or VLDL; wherein
expression of the expression region is indicative of the transformation. 

Yet another aspect of the present invention contemplates a method of transfecting a cell 
comprising the steps of providing a cell; contacting the cell with a composition comprising (i) 
an isolated polypeptide comprising at least one LDL or VLDL nucleic acid binding domain and 
(ii) an expression cassette comprising a nucleic acid sequence encoding an expression region 
and a promoter active in eukaryotic cells, wherein the expression region is operably linked to 
the promoter, and wherein the nucleic acid sequence is bound to the LDL or VLDL; wherein 
expression of the expression region is indicative of the transfection. 

Other objects, features and advantages of the present invention will become apparent 
from the following detailed description. It should be understood, however, that the detailed 
description and the specific examples, while indicating preferred embodiments of the invention, 
are given by way of illustration only, since various changes and modifications within the spirit 
and scope of the invention will become apparent to those skilled in the art from this detailed
description. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The following drawings form part of the present specification and are included to further 
demonstrate certain aspects of the present invention. The invention may be better understood 
by reference to one or more of these drawings in combination with the detailed description of 
specific embodiments presented herein. 

FIG. 1A-FIG. 1C show the amino acid sequence of apoB-100.

FIG. 2A-FIG. 2F show a homology alignment of SH3-like regions in apo B-100 with
known SH3 domains of signal transduction proteins. FIG. 2A-FIG. 2D are the homology
alignments and FIG. 2E and FIG. 2F identify the regions of apo B-100 and the proteins aligned.

FIG. 3A-FIG. 3D show a comparison of SH2-like regions in apo B-100 to known SH3
domains of signal transduction proteins. FIG. 3A-FIG. 3C are the homology alignments.
FIG. 3D identifies the proteins and regions aligned.

FIG. 4A-FIG. 4C show a comparison of the apo B-100 SH1-like region to SH1 kinase
domains of known signal transduction proteins. FIG. 4A and FIG. 4B show the alignments;
FIG. 4C identifies the proteins and regions aligned.

FIG. 5A and FIG. 5B show the inter-kringle proline-rich regions of apo[a] compared
with the proline-rich region of SH3-binding protein (3BP1). FIG. 5A shows the alignment;
FIG. 5B identifies the proteins and regions aligned.

FIG. 6A and FIG. 6B show a homology alignment of specific regions of apo B-100 and
the activation regions located at the amino- and carboxyl-termini of signal transduction
proteins.

FIG. 7 illustrates the homology of specific regions of apo B-100 with proline pipe helix
motifs of Tus.

FIG. 8A-FIG. 8D show a homology alignment among one region of the DNA-binding
protein ISGF3γ and similar regions in apo B-100.

FIG. 9A-FIG. 9D show a homology alignment among regions of the DNA-binding
protein ISGF3γ and similar regions in apo B-100.

FIG. 10A-FIG. 10N. FIG. 10A shows a sequence comparison of the DNA-binding
domains of the SREBP1, SREBP2, and ADD1 proteins with similar regions found in apo
B-100. FIG. 10B-FIG. 10N show a sequence comparison of the DNA-binding domains of
SREBP1 with various apolipoproteins.

FIG. 11 shows a comparison of the primary structures of known coiled-coil regions of
DNA-binding proteins and analogous regions in apo B-100.

FIG. 12A-FIG. 12C show a comparison of known ATP-binding loop motifs to similar 
regions in apo B-100. 

FIG. 13A-FIG. 13E show a comparison of known nuclear localization signal sequences 
to similar regions in apo B-100.

FIG. 14A-FIG. 14J show a comparison of human apo B-100 regions with sequenced
regions of apo B-100 from other species. 

FIG. 15 shows the composition of the LDL gene delivery system of the instant invention.
LDL containing apo B-100 is depicted along with a DNA sequence containing a promoter, a
protein coding region, a 3' untranslated region, and a non-coding region.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The present invention arises from the discovery that regions of apolipoproteins, the
protein fraction of lipoprotein particles, are similar in primary structure and amino acid
sequence to cellular proteins which are known to bind to DNA. Presently, the only known
functions of lipoproteins VLDL, IDL, LDL and HDL are the solubilization and transport of
hydrophobic lipids in plasma. The instant invention shows that LDLs, but not other
lipoproteins, form a complex with DNA.

Herein, synthetic analogues of regions of DNA have been shown to bind to highly
purified preparations of human, rat, and baboon LDL but not to other human lipoproteins such
as VLDL and HDL, nor to mouse lipoproteins. In fact, the differences observed among the four
species tested suggest that human, rat, and baboon lipoproteins behave very similarly in terms
of DNA binding preference. Further, purified preparations of human, rat, and baboon LDLs are
shown to complex with the promoter region of the human cytomegalovirus. Thus, the present
invention demonstrates that human LDL complexes with specific regions of genomic DNA.

Because lipoproteins have specific cell membrane receptors and are actively and
specifically internalized by many different cell types in mammals, and because the inventors
show that LDL can bind DNA, these lipoproteins can be used as gene delivery vectors. More
specifically, this invention relates to materials and methods for the use of lipoproteins, such as
LDL, or, for example, apolipoproteins such as, but not limited to, apoB-100, apoAI, apoE,
apoAIV, and apoC, or more specifically still, the DNA binding regions of these lipoproteins, as
gene delivery vectors in vivo. As explained in greater detail below, the various embodiments of
this invention include, but are not limited to, the delivery of nucleic acids to a cell in the form of
an LDL-lipoprotein complex, the specific delivery of DNA to the nucleus, and the specific
localization of delivered DNA to specific nuclear sites.

Plasma levels of DNA increase in a variety of chronic diseases including lupus
erythematosus (Steinman, 1984), viral hepatitis (Neurath et al., 1984), and a variety of cancers
(Leon et al., 1977; Shapiro et al., 1983; Stroun et al., 1987; Nawroz et al., 1996; Anker et al.,
1997; Chen et al., 1996). It further has been shown that lipoproteins in the blood of non-tumor
carrying organisms are not bound to nucleic acids. However, cancer-carrying individuals, and
in particular individuals with metastatic cancers, release large amounts of nucleic acids into
the blood. Thus, this invention also relates to the observation that lipoproteins in the blood of
cancer patients and especially metastatic cancer patients are bound to nucleic acids, including
DNA. Accordingly, this invention also may be used to provide a simple screening test for the
presence or absence of cancer, especially metastatic cancer, by isolating a patient's lipoproteins
and determining whether the lipoproteins are bound to nucleic acids; the presence of
lipoprotein-bound nucleic acid being correlative with the presence of cancer and/or metastatic
cancer in the living body. Further embodiments of the present invention relate to the
sequence-specific detection of DNA bound to lipoproteins in a cancer patient as a method for the
identification of specific types of cancer in a living body. These and other aspects of the
present invention are discussed in greater detail below.

1. LIPOPROTEINS

Lipoproteins appear as micro-pseudomicellar particles in the blood plasma of all
mammalian species including humans. Their major function is to transport lipids and other
hydrophobic compounds (i.e., fat-soluble vitamins) through the aqueous environment of the
blood stream to their specific target cells. The transported lipids can be used as a major
substrate for energy metabolism (i.e., triglycerides), structural components for cell membranes
(i.e., phospholipids and cholesterol), or as precursors for steroid hormones and bile acids (i.e.,
cholesterol). Although lipoproteins vary widely in size and lipid content, they have a common
general structure. Lipoprotein particles are believed to be spherical and consist of a
hydrophobic core containing nonpolar lipids surrounded by a hydrophilic surface monolayer of
polar lipids and proteins, which are called apolipoproteins.

Plasma lipoproteins may be separated into five major classes based on their density,
size, and compositional and functional properties: 1) chylomicrons, 2) very low density
lipoproteins (VLDL), 3) intermediate density lipoproteins (IDL), 4) low density lipoproteins (LDL),
and 5) high density lipoproteins (HDL). The different classes of lipoproteins show distinct
compositional differences in apolipoprotein content. The specific role of each class of
lipoproteins in lipid metabolism is determined by the interaction of these apolipoproteins with
specific enzymes and cellular receptors.
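
By way of illustration only, the density-based separation of the five classes may be sketched as a simple lookup. The density cut-offs used here are conventional ultracentrifugation ranges assumed for the purpose of the sketch; they are not values recited in this specification, and the function name is hypothetical.

```python
# Conventional hydrated-density ranges in g/mL (assumed textbook values,
# not taken from this specification). Each range is [low, high).
DENSITY_CLASSES = [
    ("chylomicron", 0.0, 0.95),
    ("VLDL", 0.95, 1.006),
    ("IDL", 1.006, 1.019),
    ("LDL", 1.019, 1.063),
    ("HDL", 1.063, 1.21),
]


def classify_lipoprotein(density_g_per_ml):
    """Return the class whose [low, high) range contains the density, else None."""
    for name, low, high in DENSITY_CLASSES:
        if low <= density_g_per_ml < high:
            return name
    return None
```

Density alone does not capture the compositional and functional differences described above; the lookup only mirrors the first of the listed separation criteria.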

a. ApoB100 Structure and Function

The major protein constituent of LDL is apoB-100. ApoB-100 is one of two known
natural ligands for the LDL (apoE/apoB) receptor which is found on the surface of a wide 
variety of mammalian cell types (Brown and Goldstein, 1986). LDLs are taken up by a process 
called receptor-mediated endocytosis (Brown and Goldstein, 1986). Hence, lipoproteins may 
be able to function as naturally-occurring liposomes which contain protein constituents that can 
bind specifically to nucleic acids and can be internalized by a wide variety of eukaryotic cell 
types via specific receptor mediated processes. 

Human apolipoprotein B-100 (apoB-100) is a major apoprotein component of very-low
density lipoproteins (VLDL), intermediate density lipoproteins (IDL), low density lipoproteins
(LDL), and lipoprotein[a] (Lp[a]). ApoB-100 is synthesized and incorporated into VLDL and
Lp[a] by the liver. Human LDL can be described as a spherical particle composed of a
hydrophobic core of cholesterol esters and triglycerides encapsulated by an amphipathic
monolayer of phospholipids, glycolipids and cholesterol in which the apoB-100 is partially
imbedded (Myant, 1990). In addition to one molecule of apoB-100, LDL is known to contain
varying numbers of apo C-I, apo C-II, apo C-III, apo E, and apo D (Blanco-Vaca et al., 1992;
Connelly et al., 1993; Blanco-Vaca et al., 1994).

The primary structure of apoB-100, SEQ ID NO:1 (FIG. 1A-FIG. 1C), has been
determined by amino acid sequence analysis (Yang et al., 1986; Yang et al., 1989) and inferred
from its cDNA sequence (Yang et al., 1986; Yang et al., 1989; Knott et al., 1986). There
appear to be several different isoforms of apo B-100. The isoform shown in FIG. 1A-FIG. 1C
is the isoform used for all of the alignments in the specification. Homologous regions in the
other isoforms, however, would align similarly.

The apparent molecular weight of apoB-100 is 512 kDa based on its amino acid
composition of 4536 residues. The apoprotein contains 25 Cys residues (Coleman et al., 1990;
Yang, 1990), at least 16 of which form intramolecular disulfide bonds, with the remaining
cysteines present as free sulfhydryls, as additional (unassigned) intramolecular disulfides, or as
intermolecular disulfide linkages to other apolipoproteins (Blanco-Vaca et al., 1992; Connelly
et al., 1993). Several important functional regions on apoB-100 that have been identified
include heparin-binding sites (Cardin et al., 1987; Weisgraber and Rall, 1987), glycosylation
sites (Knott et al., 1986; Innerarity et al., 1986), and the LDL receptor-binding region
(Blanco-Vaca et al., 1992; Knott et al., 1986; Milne et al., 1989).
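
As a quick consistency check on the figures above, the stated molecular weight and residue count imply a mean residue mass close to the roughly 110 Da per residue commonly used as a rule of thumb for proteins. The sketch below is illustrative only; the function name is hypothetical and the rule-of-thumb value is an assumption, not a figure from this specification.

```python
def mean_residue_mass_da(molecular_weight_kda, residue_count):
    """Average mass per amino acid residue, in daltons."""
    return molecular_weight_kda * 1000.0 / residue_count


# Values stated in the text: 512 kDa and 4536 residues.
mean_mass = mean_residue_mass_da(512, 4536)

# Cysteine accounting from the text: 25 Cys in total, at least 16 of them
# in intramolecular disulfide bonds, so at most 9 remain unassigned.
max_unassigned_cys = 25 - 16

print(f"mean residue mass: {mean_mass:.1f} Da")
print(f"cysteines not in assigned disulfides: at most {max_unassigned_cys}")
```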

ApoB-100 and apolipoprotein E (apoE), apolipoproteins present in the low-density
lipoprotein group, function as ligands for the high-affinity receptor-mediated removal of certain
lipoproteins from plasma by the liver and delivery of cholesterol and cholesterol esters to a
variety of target tissues (Myant, 1990; Innerarity et al., 1986; Brown and Goldstein, 1986;
Mahley, 1988). A general mechanism for the receptor-mediated uptake of LDL is
well-established (Myant, 1990; Innerarity et al., 1986; Brown and Goldstein, 1986; Mahley, 1988),
and the role of the apoB-100 molecule in this mechanism also is well defined.

Specific binding of low density lipoproteins to their mammalian cell receptors depends
on the presence and conformation of the apoB-100 ligands (Kinoshita et al., 1990). Several
reports have shown that the interaction of apoB-100-lipoproteins with the up-regulated, high
affinity LDL (apoB/apoE) receptor is modulated by the lipid composition of the particle (Teng
et al., 1985; Marcel et al., 1988), by other apoproteins such as apo[a] in Lp[a] (Kostner and
Grillhofer, 1991; Young et al., 1986) and apoE in β-VLDL (Innerarity et al., 1986; Mahley,
1988), and by monoclonal antibodies to specific regions of the apoB-100 molecule (Innerarity
et al., 1986; Young et al., 1986).

In searching the apoB-100 sequence for regions of sequence similarity to other proteins,
nucleic acid binding regions (deoxyribonucleic acid, DNA, and ribonucleic acid, RNA),
nucleotide-binding regions, and nuclear-localization regions in the amino acid sequences of
apoB-100 and apoE have been identified. The present invention demonstrates that highly
purified preparations of human, rat, and baboon LDL bind specifically to pure preparations of
human genomic DNA. These properties impart to the lipoproteins the capacity to serve as
delivery vehicles for genetic material.

Lipoprotein particles carry a variety of vitamins and steroid compounds in their
pseudo-micelle lipid core which may function in the control of gene expression. These attributes
impart to the lipoproteins a virus-like character as well as capacity. While the inventors do not
wish to be bound by any particular theory, the many control elements and signal motifs in the
primary structure of the apolipoproteins are suggestive of the ability of these proteins to
transport nucleic acids, enter the cell, participate in signal transduction, enter the nuclear
space, initiate incorporation of nucleic acid materials into the resident genome, and cause its
subsequent expression. As used herein, the term "primary structure" refers to the amino acid
sequence of the protein. The capacity of purified LDL to bind to human genomic DNA, along with
apoB-100's homology to SH1, SH2, and SH3 signal transducer domains, supports this hypothesis.
These properties of apoB-100, and methods of exploiting these properties, are discussed in
further detail below.

2. NUCLEIC ACID BINDING REGIONS 
The inventors have found that apo B-100 is also involved in DNA binding. DNA is the
genetic blueprint that contains the information necessary for cell growth, differentiation,
proliferation, and cellular response to environmental factors. The phenotypic differences
between various cell types in higher eukaryotes are mainly due to differences in cellular gene
expression.

The regulation of gene expression is predominantly controlled at the stage of initiation
of transcription and is mediated by proteins which recognize specific DNA sequences. In order
to recognize and bind to a specific DNA sequence, a protein utilizes a structural motif. Over the
past 15 years, several structural DNA binding motifs have been identified, including zinc
fingers, helix-turn-helix, basic helix-loop-helix, KH RNA-binding motifs, leucine zippers,
and proline pipe helices. The inventors report here the identification of regions in apo B-100
with homology to various DNA binding motifs including: 1) proline pipe helix DNA binding
motifs, 2) ISGF3γ-like DNA binding motifs, 3) SREBP-like DNA binding motifs, 4) coiled-coil
motifs, and 5) nucleotide (ATP)-binding motifs.

a. Nucleotide and ATP Binding Motifs 

The inventors discovered that there is a certain degree of homology between regions
of apo B-100 and known ATP binding motifs found in other proteins, including those involved
in signal transduction and transcriptional-ribonucleotide synthesis (tRNA synthetases).
Typically, these proteins contain sites which interact with different regions of the nucleotide,
i.e., the negatively charged phosphate regions, the ribose (carbohydrate) hydroxyl groups, and the
base. A second site binds to the substrate ligand, such as any amino acid in the case of tRNA
synthetases and tyrosine, serine and threonine residues in the phosphorylation of proteins.

Examination of the apoB-100 primary structure reveals several regions which are similar
in sequence to the known nucleotide and ATP binding motifs and are suggestive of a similar
function. For example, ATP-binding sites are known to contain an essential ATP-binding
lysine residue. In lyn, the site is Tz^gKVAVTLKPG (SEQ ID NO:54) and in lyk, it is
D386KVAIKTIREG (SEQ ID NO:55). A similar region can be found in apoB-100,
DLNAVANKIAD (SEQ ID NO:56). The similarity of this region in apo B-100 with the
ATP-binding sites on known tyrosine kinases suggests that apo B-100 can bind to the
nucleotide ATP.

A single ATP-binding region occurs between residues 3800 and 3840, which is located
in the kinase domain of apoB-100. An alignment of this region with known ATP-binding
regions of kinases is shown in FIG. 12A-FIG. 12C. FIG. 12A-FIG. 12C show a comparison of
known ATP-binding loop motifs to similar regions in apo B-100. Bold letters indicate
conserved amino acids, critical amino acids (H and K) are indicated by "#", "*" indicates
conserved amino acids, "-" indicates gaps introduced in the sequence in order to align the
proteins, and identical amino acids between the sequences in "C" are listed below the
alignment. Sequence identification numbers are listed in the right margin. The critical lysine
residue is retained and the degree of similarity suggests a like function.

The ATP-binding motifs typical of tRNA synthetases are characterized by the signature
sequence HIGH (histidine, isoleucine, glycine, histidine) (SEQ ID NO:177), and a second motif
which contains a critical lysine residue. These motifs are located within 300 residues and occur
as proximal loops on the surface of the protein molecule. Several analogues of this signature
sequence occur in the apoB-100 sequence (see FIG. 7 and FIG. 12A-FIG. 12C). An extended
comparison of apoB-100 regions which contain the HIGH signature sequence is made with the
tyrosyl-tRNA synthetase sequence shown in FIG. 12A-FIG. 12C.
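
By way of illustration only, the search for a HIGH signature accompanied by a second, lysine-containing motif within 300 residues can be sketched in a few lines of Python. The sketch assumes that the second motif is the KMSK motif named elsewhere in this specification, that it lies downstream of the signature, and that only exact matches are of interest; the function name and the toy sequence are hypothetical and are not part of the disclosure.

```python
def find_motif_pairs(seq, first="HIGH", second="KMSK", max_gap=300):
    """Return (i, j) start indices (0-based) where `first` begins at i and
    `second` begins at j, with j downstream of i by at most `max_gap` residues."""
    pairs = []
    i = seq.find(first)
    while i != -1:
        # Look for the second motif after the end of the signature.
        j = seq.find(second, i + len(first))
        while j != -1 and j - i <= max_gap:
            pairs.append((i, j))
            j = seq.find(second, j + 1)
        i = seq.find(first, i + 1)
    return pairs


# Toy sequence (not apoB-100): a signature followed 54 residues later by KMSK.
toy = "A" * 10 + "HIGH" + "A" * 50 + "KMSK" + "A" * 5
print(find_motif_pairs(toy))
```

Analogues of the signature (for example, sequences differing at one position) would require a mismatch-tolerant comparison, which this exact-match sketch does not attempt.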



b. Proline Pipe Helix Structures 

The proline pipe helix is usually present in proteins that contain proline at every fifth
position (Myant, 1990) in an amino acid sequence that is at least 20 residues long, (PXXXXP)n
(SEQ ID NO:75) where n>4. In the proline pipe helix, 5.56 residues are required to make one
complete left-handed helical turn. The proline pipe helix is stabilized by a hydrogen bonding
network between the C=O groups of residues in positions i+1, i+2, i+3 (where i is a proline or
sometimes non-proline residue) with the NH groups in positions i+2, i+3, i+4, respectively, of
the following turn (Blanco-Vaca et al., 1992). The unusually large turn of the helix results in
the formation of a channel running along the helix that is about 6 Å in average diameter (Myant,
1990) and large enough to accommodate water (Blanco-Vaca et al., 1992) and possibly other
molecules.

One function of the proline pipe helix is DNA binding. For example, the proline pipe
helix in Tus is involved in tight binding to highly specific 22-23 base pair DNA known as Ter
sites (Connelly et al., 1993; Blanco-Vaca et al., 1994). Because of its large diameter compared
to the α-helix, the proline pipe helix spans the entire width of the major groove (Blanco-Vaca et
al., 1992) and results in a tight and highly specific fit. This tight fit also results in a high
correspondence between the positively charged amino acid residues of the proline pipe helix
and the negatively charged phosphate groups of DNA (Blanco-Vaca et al., 1992). The
occurrence of the proline pipe-DNA interactions in nature might be more widespread than
presently thought and this interaction might play a very important biological function.

Careful examination and analysis of the apoB-100 amino acid sequence shows that the
40-residue proline-rich segment P2682-I2719, or a portion of this segment, assumes a proline
pipe helical conformation (see FIG. 7),
PDFRLPEIAIPEFIIPTLNLNDFQVPDLHIPEFQLPHISH (SEQ ID NO:76).
Because the unique features of the proline pipe helix make it
suitable for tight and highly specific DNA binding, this segment or motif in apoB-100
constitutes one of the DNA binding sites.
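
The every-fifth-position proline pattern can be checked directly on the segment quoted above (SEQ ID NO:76). The following Python sketch is offered only as an illustration; the helper names are hypothetical, and positions are counted from zero within the segment rather than in apoB-100 residue numbering.

```python
# SEQ ID NO:76 as printed above, joined into a single string.
SEGMENT = "PDFRLPEIAIPEFIIPTLNLNDFQVPDLHIPEFQLPHISH"


def proline_positions(seq):
    """0-based indices of every proline (P) in the sequence."""
    return [i for i, aa in enumerate(seq) if aa == "P"]


def spacings(positions):
    """Distances between consecutive positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


pos = proline_positions(SEGMENT)
print(len(SEGMENT))   # 40
print(pos)            # [0, 5, 10, 15, 25, 30, 35]
print(spacings(pos))  # [5, 5, 5, 10, 5, 5]
```

Six of the seven prolines follow the five-residue periodicity; the single spacing of ten corresponds to a position in the repeat occupied by a non-proline residue, which the description of the helix above expressly permits.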

The functional implications of DNA binding by apoB-100 include, but are not limited
to: (1) binding of DNA such as, for example, microsatellite DNA (Connelly et al., 1993;
Blanco-Vaca et al., 1994) to apoB-100 or its fragment(s) for DNA transport from the cytoplasm
to the nucleus; (2) binding of apoB-100 or its fragment(s) to the nuclear DNA to regulate
transcription or effect other functions; or (3) binding of DNA to apoB-100 or its fragment(s) to
transport DNA from the nucleus to the cytoplasm. Other functions as a consequence of
apoB-100 DNA binding through the apoB-100 proline pipe helix are not precluded. Therefore, the
proline pipe region of apoB-100 constitutes an important target for structure-based drug design
and delivery systems.

c. ISGF3γ-Like DNA Binding Motifs

ISGF3 is a multimeric transcription factor involved in the regulation of transcription of a
large set of genes. This factor dissociates into two protein components termed ISGF3γ and
ISGF3α. ISGF3γ is a 48 kDa protein that binds DNA, recognizing the IFN-stimulated response
element. ISGF3α does not bind DNA. Regions in apoB-100 have been found to be
homologous to the DNA-binding domain of ISGF3γ (FIG. 8A-FIG. 8D and FIG. 9A-FIG. 9D).

FIG. 8A-FIG. 8D show a homology alignment among one region of the DNA-binding
protein ISGF3γ and similar regions in apo B-100. Basic amino acids are indicated in bold, "*"
indicates conserved amino acids between the two regions, and "V" indicates conserved amino
acids that have switched positions between the two sequences aligned. Sequence identification
numbers are identified in the legend to the figure.

FIG. 9A-FIG. 9D show a homology alignment among regions of the DNA-binding
protein ISGF3γ and similar regions in apo B-100. Basic amino acids are indicated in bold, and "-"
indicates gaps introduced in the sequence in order to align the two proteins. Sequence
identification numbers are identified in the right margin.



This indicates apoB-100 can bind specific DNA sequences in a manner similar to ISGF3γ.



d. SREBP-Like DNA Binding Motifs 

Another region within apoB-100 shows striking resemblance to the DNA binding
domains of previously identified sterol regulatory element binding proteins (SREBP's; FIG.
10A and FIG. 10B). A sequence comparison of the DNA-binding domains of the SREBP1,
SREBP2, and ADD1 proteins with similar regions found in apo B-100 is shown in FIG. 10A,
where basic amino acids are indicated in bold, "*" indicates conserved amino acids, "-"
indicates gaps introduced in the sequence in order to align the two proteins, and identical amino
acids between the two sequences are listed below the alignment. FIG. 10B shows a sequence
comparison of the DNA-binding domains of SREBP1 with various apolipoproteins, where basic
amino acids are indicated in bold, "*" indicates conserved amino acids, "-" indicates gaps
introduced in the sequence in order to align the two proteins, "V" indicates conserved amino acids
that have switched positions between the two sequences aligned, and identical amino acids
between the two sequences are listed below the alignment. Sequence identification numbers are
indicated in the legend to the figure. A full line separates the different sequence alignments.

SREBP's are members of the basic helix-loop-helix-leucine zipper (bH-L-H-Zip) family
of transcription factors and play a major role in the transcriptional regulation of a number of
genes involved in cholesterol homeostasis as well as lipid biosynthesis. SREBP's contain 3
segments: 1) an NH2 terminal bH-L-H-Zip DNA binding domain including an acidic
transcription activating domain; 2) a middle segment containing two membrane spanning
domains; and 3) a COOH terminal segment. In order for SREBP's to become functionally
active transcription factors, their NH2 terminal domain containing the bH-L-H-Zip region needs
to be released from the endoplasmic reticulum or nuclear envelope. This process is mediated by
a sterol-regulated protease. These similarities suggest that apo B-100, like the SREBP's, binds DNA.

e. Coiled-coil Motif (Leucine Zipper)

The coiled-coil motif (Myant, 1990), sometimes referred to as the leucine zipper
(Blanco-Vaca et al., 1992), is characterized by two α-helical chains that wrap around each other
to form a left-handed supercoil. The amino acid sequence of coiled-coil forming proteins is
characterized by the presence of heptad repeats, that is, three or more repeats of a seven-residue
sequence in which hydrophobic residues recur at alternating intervals of three and four
residues, i.e., at the first and fourth positions of each heptad (Blanco-Vaca et al., 1992;
Connelly et al., 1993; Blanco-Vaca et al., 1994). The two α-helical chains that form the
coiled-coil can align in either parallel or anti-parallel orientation and their stabilities are
dependent on the presence of strategically located
hydrophobic and electrostatic interactions (Yang et al., 1986; Yang et al., 1989; Knott et al.,
1986; Coleman et al., 1990; Yang, 1990; Cardin et al., 1987; Weisgraber and Rall, 1987;
Innerarity et al., 1986; Milne et al., 1989; Brown and Goldstein, 1986). The most attractive
feature of the coiled-coil is that highly specific interactions can be tailored by redesigning this
relatively simple motif.
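
As a rough illustration of how a heptad repeat might be screened for in a sequence, the Python sketch below scores each of the seven possible reading frames by the fraction of first and fourth heptad positions occupied by a hydrophobic residue. The set of residues treated as hydrophobic, the function names, and the toy sequence are all assumptions made for the example; this is not the alignment method used to prepare FIG. 11.

```python
# Assumed hydrophobic residue set (one-letter codes); other definitions exist.
HYDROPHOBIC = set("AILMFVWY")


def heptad_score(seq, frame):
    """Fraction of first/fourth heptad positions that are hydrophobic when
    heptads are counted starting at index `frame` (0-6)."""
    hits = 0
    total = 0
    for start in range(frame, len(seq), 7):
        for idx in (start, start + 3):  # first and fourth positions
            if idx < len(seq):
                total += 1
                hits += seq[idx] in HYDROPHOBIC
    return hits / total if total else 0.0


def best_frame(seq):
    """Reading frame (0-6) with the highest heptad score; ties go to the lowest."""
    return max(range(7), key=lambda f: heptad_score(seq, f))


# Toy sequence (not from apoB-100): four ideal heptads with leucine at both positions.
toy = "LKELEEK" * 4
print(best_frame(toy), heptad_score(toy, 0))
```

A score close to 1.0 sustained over three or more consecutive heptads would be consistent with the repeat described above; a practical screen would also weigh helix propensity and the electrostatic interactions mentioned in the text.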

The coiled-coil motif occurs widely in native proteins (Lupas et al., 1991; Cohen and
Parry, 1986). It plays structural and functional roles in fibrous proteins such as keratin, myosin,
elastin, fibrinogen, tropomyosin, etc. The coiled-coil motif also serves as the dimerization
domain for a number of transcription factors such as GCN4 (O'Shea et al., 1991; Ellenberger et
al., 1992), GAL4 (Kraulis et al., 1992; Baleja and Sykes, 1991; Marmorstein et al., 1992), and
c-Fos-c-Jun (Glover and Harrison, 1995), where only the dimeric form binds to DNA and is
active. It is found in globular proteins, such as tRNA synthetase (Cusack et al., 1990; Biou et
al., 1994), and serves as an anchor into the tRNA. Naturally occurring coiled-coils can also be
found as three-stranded (Bullough et al., 1994a; Bullough et al., 1994b) or four-stranded
(Banner et al., 1987) structures.

Sequence alignment analysis of apoB-100 predicts that there are at least eight coiled-coil
structures of varying lengths in different regions of its amino acid sequence (FIG. 11). FIG. 11
shows a comparison of the primary structures of known coiled-coil regions of DNA-binding
proteins and analogous regions in apo B-100. Bold letters indicate conserved amino acids.
Sequence identification numbers are listed in the right margin.

While the inventors do not wish to be bound by any particular theory, it is likely that
these coiled-coil domains play very important structural and functional roles that, in turn, are
vital to the function of LDL. For example, the coiled-coil motif can serve as dimerization or
multimerization sites that may be important in LDL solubilization or aggregation. The
coiled-coil motif can also bind DNA, RNA or nucleotides and, therefore, plays a very important
role in the regulation and energetics of protein synthesis. The coiled-coil motif can also serve
as a template for transport of molecules within and between the cytoplasm and the nucleus. In
addition, the coiled-coil motif can also serve as a (temporary) reservoir of ligands that may be
important in the regulation of the metabolic pathways. This list is by no means exhaustive, but
demonstrates the biological importance of the coiled-coil motif in apoB-100.
Given the discovery of the coiled-coil motif in apoB-100 and the important biological
implications of its presence, apoB-100, by itself or as part of LDL, constitutes an important
target for structure-based drug design, delivery, and diagnostic systems. Coiled-coil forming
sequences in apoB-100 (as indicated in FIG. 11) can be used to design, study and manufacture
coiled-coil based peptide or protein delivery systems for drugs, radioisotopes, oligonucleotides,
genes, antigens, antibodies, epitopes for vaccines, sugars, carbohydrate analogs and other
ligands to specific targets in cells, tissues and organs. Either a single strand or multiple strands
of the apoB-100 coiled-coil forming peptide sequences can be used as components of, or
attached to, the aforementioned ligands by either covalent or non-covalent methods.

Coiled-coil forming sequences in apoB-100 (FIG. 11), or fragments, analogs, or
modifications thereof, can be used as site-specific targets for the delivery of drugs,
radioisotopes, oligonucleotides, genes, antigens, antibodies, epitopes for vaccines, sugars,
carbohydrate analogs and other ligands. Site-specific targeting includes the use of coiled-coils,
coiled-coil forming peptides, or any functional group that binds to the aforementioned
coiled-coil sequences in apoB-100.

3. NUCLEAR LOCALIZATION SIGNALS 

In addition to homology with DNA binding proteins, apoB-100 contains several regions
that are homologous to known nuclear localization signals (FIG. 13A-FIG. 13E). These signals
include the NLS from human p53, Abl, and apoJ. FIG. 13A-FIG. 13E show a comparison of
known nuclear localization signal sequences to similar regions in apo B-100.

The bipartite nuclear localization signal contains two essential elements comprised of
basic amino acids, H (histidine), R (arginine), and K (lysine), which are required for nuclear
targeting. The signal motif starts with two basic amino acids, which are followed by a ten
to thirty amino acid spacer and a cluster of five amino acids, three of which must be basic.
Approximately 50% of the known nuclear proteins listed in the protein databases have this
motif, while less than 5% of non-nuclear proteins have it. FIG. 13A and FIG. 13B show
sequences in apoB-100 with the perfect 10 amino acid spacer between the bipartite nuclear
localization sequence elements.
There is no strict requirement for the spacer length other than perhaps flexibility in the
amino acids, i.e., the dihedral angles. Indeed, there are basic amino acid clusters in the
apo B-100 molecule that are separated by longer spacers and are nevertheless potential DNA-binding
regions. FIG. 13C shows sequences in apoB-100 with more or less than 10 amino acids in the
spacer region between the bipartite nuclear localization sequence elements, and FIG. 13D-FIG. 13E
show sequences in apoB-100 with more or less than 10 amino acids in the spacer
region between the elements of an imperfect "bipartite" nuclear localization sequence.
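
The bipartite rule stated above (two basic residues, a spacer of ten to thirty residues, then a window of five residues of which at least three are basic) lends itself to a simple scan. The Python sketch below is one possible reading of that rule, offered for illustration only; the function name, its default parameters, and the toy sequence are assumptions and do not reproduce the analysis behind FIG. 13A-FIG. 13E.

```python
BASIC = set("HRK")  # histidine, arginine, lysine, as listed in the text


def find_bipartite_nls(seq, min_spacer=10, max_spacer=30,
                       cluster_len=5, min_basic=3):
    """Return (start, spacer_length) for each place where two consecutive basic
    residues are followed, after the spacer, by a window of `cluster_len`
    residues containing at least `min_basic` basic residues."""
    hits = []
    for start in range(len(seq) - 1):
        if seq[start] in BASIC and seq[start + 1] in BASIC:
            for spacer in range(min_spacer, max_spacer + 1):
                c0 = start + 2 + spacer
                cluster = seq[c0:c0 + cluster_len]
                if len(cluster) < cluster_len:
                    break  # ran off the end; longer spacers cannot fit either
                if sum(aa in BASIC for aa in cluster) >= min_basic:
                    hits.append((start, spacer))
    return hits


# Made-up toy sequence: KR, a 10-residue spacer, then a lysine-rich cluster.
toy = "KRPAATKKAGQAKKKKLD"
print(find_bipartite_nls(toy))  # [(0, 10), (0, 11)]
```

Because the text notes that spacer length is not strictly constrained, `min_spacer` and `max_spacer` are left as parameters so that shorter or longer spacers can be explored.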

Thus, these regions in apoB-100 are NLS sequences capable of directing DNA to the
nucleus of a cell. Apolipoproteins present on human LDL can bind to DNA through the DNA
binding motifs identified herein. The functional bH-L-H-Zip domain within apoB-100 can
enter the nucleus, following proteolytic release and/or aided by the nuclear localization signal
domains present within the apolipoproteins, and regulate transcription of the target genes.

In addition, apo B-100 appears to be conserved across species. FIG. 14A-FIG. 14J show
various regions of human apo B-100 aligned with the sequenced fragments of the apo B-100
from pig, rat, hamster, mouse, chicken and rabbit. Bold and underlined letters indicate
positively charged, basic amino acids, and "-" indicates gaps introduced in the sequence in order
to align the proteins.

4. HOMOLOGY TO SIGNAL TRANSDUCING PROTEINS 

The inventors have found that, in addition to homology with nuclear localization signals
and DNA binding proteins, the apoB-100 molecule has regions of sequence similarity to known
motifs in a variety of signal transduction molecules. For example, regions of apo B-100 are
homologous to src homology 3 (SH3) (FIG. 2A-FIG. 2F), src homology 2 (SH2) (FIG. 3A-FIG. 3D)
and src homology 1 (SH1) (FIG. 4A-FIG. 4C) kinase domains that are common to
protein tyrosine kinases of the signal transduction system (Koch et al., 1991; Pawson, 1992;
Schlessinger, 1994; Margolis, 1992; Waksman et al., 1993; Carpenter, 1992; Ugi et al., 1994;
Lowenstein et al., 1992; Guevara, Jr. et al., 1994), as well as activation regions located at the
amino- and carboxyl-termini of signal transduction proteins (FIG. 6A and FIG. 6B).

WO 98/56938 PCT/US98/11927

FIG. 2A-FIG. 2F show a homology alignment of SH3-like regions in apo B-100 with
known SH3 domains of signal transduction proteins, where "*" indicates conserved amino
acids, "-" indicates gaps introduced in the sequence in order to align the two proteins, identical
amino acids between the two sequences are listed below the alignment, and percent similarity is
indicated in the right margin. This alignment is followed by a table identifying the regions of
apoB-100 and the various proteins aligned to these regions along with their respective sequence
identification numbers.

FIG. 3A-FIG. 3D show a comparison of SH2-like regions in apo B-100 to known SH2
domains of signal transduction proteins, where structurally important motifs are indicated by
double underline, basic amino acids are indicated in bold, "*" indicates conserved amino acids,
"." indicates gaps introduced in the sequence in order to align the two proteins, identical amino
acids between the two sequences are listed below the alignment, and percent similarity is
indicated in the right margin. The alignment is followed by a table identifying the reference
proteins and regions of apoB-100 in the alignment along with their sequence identification
numbers.

FIG. 4 shows a comparison of the apo B-100 SH1-like region to SH1 kinase domains of
known signal transduction proteins, where basic amino acids are indicated in bold, "*" indicates
conserved amino acids, "-" indicates gaps introduced in the sequence in order to align the two
proteins, and identical amino acids between the two sequences are listed above the alignment.
The alignment is followed by a table identifying the reference proteins and the region of
apoB-100 used for the alignment along with their respective sequence identification numbers.

FIG. 6A and FIG. 6B show a homology alignment of specific regions of apo B-100 and the
activation regions located at the amino- and carboxyl-termini of signal transduction proteins,
where "*" indicates conserved amino acids, "-" indicates gaps introduced in the sequence in
order to align the two proteins, and identical amino acids between the two sequences are listed
above the alignment. Numbers in parentheses indicate amino acid residues shown in the
alignment and sequence identification numbers are listed in the right margin.
Discovery of these motifs in the apoB-100 sequences was based on a series of reports
(Ye et al., 1988; Trieu and McConathy, 1990; Trieu et al., 1991) which showed that free proline
inhibited binding of recombinant apo[a] to both Lp[a] and LDL. These results implied that
proline within the apoB-100 sequence interacted with the kringle binding pocket. Molecular
modeling was used to determine if proline is a ligand for the different apo[a] kringle types
(Guevara, Jr. et al., 1993). These studies concluded that although free proline can be
accommodated by the ligand binding site of several apo[a] kringle types, proline located within
a polypeptide chain probably does not fit into any of the ligand binding sites of apo[a] kringles.
As an alternative possibility, proline might bind at an allosteric site on the kringle structure
(Guevara, Jr. et al., 1993), and thereby alter the ligand binding site of the kringle. A second
possibility is that apo[a] kringles are not involved at all, but rather that the proline/threonine-
rich inter-kringle regions (IKR's) associate with specific sites on apoB-100, and thereby enable
recombinant apo[a] to bind to Lp[a] and LDL.

a. The SH3 Domain 

The inter-kringle regions of apo[a] have homology to 3BP1 (FIG. 5). FIG. 5 shows the
inter-kringle proline-rich regions of apo[a] compared with the proline-rich region of SH3-
binding protein (3BP1), where the conserved prolines are indicated in bold and "-" indicates
gaps introduced in the sequences in order to align the two proteins. Following the alignments is
a table identifying the inter-kringle proline-rich regions of apo[a] and the proline-rich region of
3BP1 used for the alignment along with their respective sequence identification numbers.

Apo[a] is a hydrophilic, glycosylated apoprotein that is disulfide-linked to apo B-100 in
the lipoprotein[a] particle. The proline-rich hinges between the kringle structures of apo[a] are
suggestive of a role in signaling. Cicchetti et al. (1992) and Ren et al. (1993) described a ten
amino acid, proline-rich segment of the 3BP-1 protein which binds to an SH3 domain in Abl, a
non-receptor protein tyrosine kinase involved in signal transduction. The proline-rich IKR's in
apo[a] (McLean et al., 1987; Guevara, Jr. et al., 1992), like those in 3BP-1, contain the
sequence PXP (SEQ ID NO:2) which is important for the interaction of these motifs with their
corresponding SH3 domains.
Proline-rich binding proteins (BPs), SH3, and SH2 domains are regulatory domains in
signaling proteins which mediate enzymatic activity, participate in intracellular protein-protein
interactions, and bind to activated receptor protein-tyrosine kinases (Koch et al., 1991; Pawson,
1992; Schlessinger, 1994; Margolis, 1992; Waksman et al., 1993; Carpenter, 1992; Ugi et al.,
1994; Lowenstein et al., 1992; Guevara, Jr. et al., 1994; Pleiman et al., 1994). The sequence
similarities noted between apo[a] IKR's and the proline-rich segment of 3BP-1 suggest a similar
function for these regions of apo[a] in non-covalent interactions between apo[a] and
apoB-100, i.e., binding of a proline-rich region in apo[a] to an SH3-like region in apoB-100.

In apoB-100, at least 13 regions share high sequence similarities with SH3 domains.
SH3 domains are found in several signal transduction proteins such as phosphatidylinositol-3'
kinase (PI3K) and the non-receptor tyrosine kinase Abl (see FIG. 1 and FIG. 4). This suggests
that apo B-100 may have signal transduction properties.

b. The SH2 Domain 

Many signal transduction proteins and other proteins such as tyrosine phosphatases and
tensin also contain SH2 domains (Koch et al., 1991; Pawson, 1992; Schlessinger, 1994;
Lowenstein et al., 1992), often flanked by SH3 domains. SH2 domains are typically comprised
of about 100 amino acids. In the signaling process, SH2 domains bind to specific
phosphotyrosine motifs of target proteins (Songyang et al., 1993; Escobedo et al., 1991). The
apoB-100 sequence was examined for the presence of SH2-like regions, and numerous regions in
the apoB-100 sequence were found to share some commonalities with SH2 domains of
signaling proteins (FIG. 3A-FIG. 3D). This suggests that apoB-100 may interact with
phosphorylated proteins through SH2-like regions.

c. The SHI Domain 

Typically, signal transduction proteins also contain a kinase domain or src homology
domain 1 (SH1) which is located in the carboxyl region of the protein and is comprised of about
300 amino acids (Rudd et al., 1993). SH1 domains are highly homologous. Regions of
apo B-100 have been found that share homology with SH1 domains (FIG. 4). In addition, apo B-100
shares homology with the catalytic loop or active site motif in these signaling proteins. For
example, the active site motif of lyn (EC 2.7.1.112) is R359KNYIHRDLRAAN (SEQ ID
NO:52), a sequence that is highly conserved. An analogous region is found in apoB-100,
K39,9GTLAHRDFSAE (SEQ ID NO:53).

Furthermore, apo B-100 shares amino acid sequence homology with the activation
regions located at the amino- and carboxyl-termini of signal transduction proteins (FIG. 6A and
FIG. 6B). Protein kinase C and cAMP-dependent kinase control sites are present at the
amino-terminus of signal transduction proteins. Tyrosine kinase control sites are located in the
carboxyl-terminus of these proteins. Typically, there is little sequence homology at the
amino-termini, but high homology is common at the carboxyl-termini of signaling protein kinases.

The regions of apo B-100 having sequence similarity to SH3, SH2 and
SH1 domains and to other cell signaling proteins all point to the possibility that apo B-100 is
involved in intracellular signaling.



5. PROTEIN EXPRESSION 

As described above, the inventors have discovered that a particular region of the
apoB-100 molecule is similar in sequence to the sterol regulatory element binding proteins
SREBP1 and SREBP2 and to ADD1. Other regions of the apoB-100 molecule are similar to specific
regions in other known DNA binding proteins including, but not limited to, ISGF3γ, the coiled-coil
regions of GCN4 and hMLK1, and the proline pipe sequences of Tus. Further, the inventors
found that the amino acid sequences of apolipoproteins, such as apoB-100, have regions involved
with nucleotide binding and nuclear localization. For example, apolipoproteins such as
apoB-100 show homology to the SH1 kinase domains of protein tyrosine kinases and to the HIGH and
KMSK motifs plus critical lysine of tRNA synthetases, both known to bind ATP, as well as to the
basic helix-loop-helix motif of sterol regulatory element binding proteins (SREBPs), known to
localize to the nucleus where they are involved in the regulation of transcription.
a. Expression of apoB100

In certain embodiments of the present invention, it will be necessary to obtain apoB100 or
lipoproteins containing apoB100 for use as DNA binding compositions. In particular
embodiments as described herein below, such apoB100 may be obtained from the lipoprotein
fraction of primate serum. As an alternative to purifying apoB100 from the LDL fraction of serum, it
is possible to generate pure fractions of apoB100 by recombinant expression of the apoB100
gene. The apoB100 gene can be inserted into an appropriate expression system. The gene can be
expressed in any number of different recombinant DNA expression systems to generate large
amounts of the polypeptide product, which can then be purified and used as a DNA binding
composition as described herein.

In one embodiment, specific amino acid sequence domains of an apoB100 polypeptide
having, for example, the sequence of SEQ ID NO:1, can be prepared. These may, for instance, be
minor sequence variants of a polypeptide that arise due to natural variation within the population
or they may be homologues found in other species. They also may be sequences that do not occur
naturally but that are sufficiently similar that they function similarly and/or elicit an immune
response that cross-reacts with natural forms of the polypeptide.

The nucleotide binding, nuclear localization and signal transduction domains of the
apoB100 molecule are discussed in detail herein below. Recombinant technologies, well
known to those of skill in the art, may be used to produce recombinant apoB100 with one or
more of these domains having sequences that optimize the DNA binding and/or nuclear
localization capacities of the molecule. Furthermore, in certain instances it may be necessary to
"customize" such domains in order to increase binding to a particular DNA sequence whilst
decreasing the binding to other sequences. Alternatively, it may be preferable to alter a
particular apoB100 polypeptide in order to decrease its binding affinity for a particular
molecule. Accordingly, sequence variants of these domains can be prepared by standard methods
of site-directed mutagenesis such as those described below in the following section.
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Amino acid sequence variants of an apoBlOO polypeptide, or particular domains therein 
can be substitutional, insertional or deletion variants. Deletion variants lack one or more residues 
of the native protein which are not essential for function or immunogenic activity. 

Substitutional variants typically contain the exchange of one amino acid for another at one 
or more sites within the protein, and may be designed to modulate one or more properties of the 
polypeptide such as stability against proteolytic cleavage. Substitutions preferably are 
conservative, that is, one amino acid is replaced with one of similar shape and charge. 
Conservative substitutions are well known in the art and include, for example, the changes of: 
alanine to serine; arginine to lysine; asparagine to glutamine or histidine; aspartate to glutamate; 
cysteine to serine; glutamine to asparagine; glutamate to aspartate; glycine to proline; histidine to 
asparagine or glutamine; isoleucine to leucine or valine; leucine to valine or isoleucine; lysine to 
arginine; methionine to leucine or isoleucine; phenylalanine to tyrosine, leucine or methionine; 
serine to threonine; threonine to serine; tryptophan to tyrosine; tyrosine to tryptophan or 
phenylalanine; and valine to isoleucine or leucine. 

Insertional variants include fusion proteins such as those used to allow rapid purification 
of the polypeptide and also can include hybrid proteins containing sequences from other proteins 
and polypeptides which are homologues of the polypeptide. For example, an insertional variant 
could include portions of the amino acid sequence of the polypeptide from one species, together 
with portions of the homologous polypeptide from another species. Other insertional variants can 
include those in which additional amino acids are introduced within the coding sequence of the 
polypeptide. These typically are smaller insertions than the fusion proteins described above and 
are introduced, for example, into a protease cleavage site. Alternatively, insertional variants of the 
present invention may be created in which one or more DNA binding domains and nuclear 
localization domains have been added to a native apoB100 molecule to alter particular 
characteristics of the molecule. 



In one embodiment, major antigenic determinants of the polypeptide are identified by an 
empirical approach in which portions of the gene encoding the polypeptide are expressed in a 
recombinant host and the resulting proteins tested for their ability to elicit an immune response. 
For example, PCR can be used to prepare a range of cDNAs encoding peptides lacking 
successively longer fragments of the C-terminus of the protein. The immunoprotective activity of 
each of these peptides then identifies those fragments or domains of the polypeptide that are 
essential for this activity. Further experiments in which only a small number of amino acids are 
removed at each iteration then allow the location of the antigenic determinants of the 
polypeptide to be determined. 

Another embodiment for the preparation of polypeptides according to the invention is the 
use of peptide mimetics. Mimetics are peptide-containing molecules that mimic elements of 
protein secondary structure. See, for example, Johnson et al., "Peptide Turn Mimetics" in 
BIOTECHNOLOGY AND PHARMACY, Pezzuto et al., Eds., Chapman and Hall, New York 
(1993). The underlying rationale behind the use of peptide mimetics is that the peptide backbone 
of proteins exists chiefly to orient amino acid side chains in such a way as to facilitate molecular 
interactions, such as those of antibody and antigen. A peptide mimetic is expected to permit 
molecular interactions similar to the natural molecule. 

Successful applications of the peptide mimetic concept have thus far focused on mimetics 
of β-turns within proteins, which are known to be highly antigenic. Likely β-turn structure within 
a polypeptide can be predicted by computer-based algorithms as discussed above. Once the 
component amino acids of the turn are determined, peptide mimetics can be constructed to achieve 
a similar spatial orientation of the essential elements of the amino acid side chains. 

Modifications and changes may be made in the structure of a gene and still obtain a 
functional molecule that encodes a protein or polypeptide with desirable characteristics. The 
following is a discussion based upon changing the amino acids of a protein to create an equivalent, 
or even an improved, second-generation molecule. The amino acid changes may be achieved by 
changing the codons of the DNA sequence, according to the following data. 

For example, certain amino acids may be substituted for other amino acids in a protein 
structure without appreciable loss of interactive binding capacity with structures such as, for 
example, antigen-binding regions of antibodies or binding sites on substrate molecules. Since it is 
the interactive capacity and nature of a protein that defines that protein's biological functional 
activity, certain amino acid substitutions can be made in a protein sequence, and its underlying 
DNA coding sequence, and nevertheless obtain a protein with like properties. It is thus 
contemplated by the inventors that various changes may be made in the DNA sequences of genes 
without appreciable loss of their biological utility or activity. 

In making such changes, the hydropathic index of amino acids may be considered. The 
importance of the hydropathic amino acid index in conferring interactive biologic function on a 
protein is generally understood in the art (Kyte & Doolittle, 1982). 

TABLE 1 

Amino Acids      Codons 

Alanine          GCA GCC GCG GCU 
Cysteine         UGC UGU 
Aspartic acid    GAC GAU 
Glutamic acid    GAA GAG 
Phenylalanine    UUC UUU 
Glycine          GGA GGC GGG GGU 
Histidine        CAC CAU 
Isoleucine       AUA AUC AUU 
Lysine           AAA AAG 
Leucine          UUA UUG CUA CUC CUG CUU 
Methionine       AUG 
Asparagine       AAC AAU 
Proline          CCA CCC CCG CCU 
Glutamine        CAA CAG 
Arginine         AGA AGG CGA CGC CGG CGU 
Serine           AGC AGU UCA UCC UCG UCU 
Threonine        ACA ACC ACG ACU 
Valine           GUA GUC GUG GUU 
Tryptophan       UGG 
Tyrosine         UAC UAU 
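As an illustration of how the codon table can guide a substitution at the nucleic acid level, the following Python sketch finds, for a given codon, the codon of a target amino acid that requires the fewest nucleotide changes. The function names `translate` and `closest_codon`, the three-letter keys and the tie-breaking rule (first codon in table order) are assumptions made for this example only and form no part of the disclosure.

```python
# Standard genetic code (RNA codons) for the twenty amino acids of Table 1.
CODON_TABLE = {
    "Ala": ["GCA", "GCC", "GCG", "GCU"],
    "Cys": ["UGC", "UGU"],
    "Asp": ["GAC", "GAU"],
    "Glu": ["GAA", "GAG"],
    "Phe": ["UUC", "UUU"],
    "Gly": ["GGA", "GGC", "GGG", "GGU"],
    "His": ["CAC", "CAU"],
    "Ile": ["AUA", "AUC", "AUU"],
    "Lys": ["AAA", "AAG"],
    "Leu": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "Met": ["AUG"],
    "Asn": ["AAC", "AAU"],
    "Pro": ["CCA", "CCC", "CCG", "CCU"],
    "Gln": ["CAA", "CAG"],
    "Arg": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "Ser": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "Thr": ["ACA", "ACC", "ACG", "ACU"],
    "Val": ["GUA", "GUC", "GUG", "GUU"],
    "Trp": ["UGG"],
    "Tyr": ["UAC", "UAU"],
}


def translate(codon):
    """Return the amino acid encoded by an RNA codon (stop codons are not listed)."""
    for amino_acid, codons in CODON_TABLE.items():
        if codon in codons:
            return amino_acid
    raise ValueError(f"{codon!r} does not encode an amino acid in the table")


def closest_codon(codon, target_amino_acid):
    """Return (new_codon, n_changes): the target's codon nearest to `codon`.

    Ties are resolved in favor of the first codon in table order
    (an arbitrary choice made for this illustration).
    """
    def distance(candidate):
        return sum(a != b for a, b in zip(codon, candidate))

    best = min(CODON_TABLE[target_amino_acid], key=distance)
    return best, distance(best)
```

For instance, under these assumptions a lysine codon AAA can be converted to an arginine codon (AGA) by a single nucleotide change.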



It is accepted that the relative hydropathic character of the amino acid contributes to the 
secondary structure of the resultant protein, which in turn defines the interaction of the protein 
with other molecules, for example, enzymes, substrates, receptors, DNA, antibodies, antigens, 
and the like. 



Each amino acid has been assigned a hydropathic index on the basis of its 
hydrophobicity and charge characteristics (Kyte & Doolittle, 1982); these are: isoleucine (+4.5); 
valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); 
alanine (+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); 
proline (-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5); asparagine 
(-3.5); lysine (-3.9); and arginine (-4.5). 

It is known in the art that certain amino acids may be substituted by other amino acids 
having a similar hydropathic index or score and still result in a protein with similar biological 
activity, i.e., still obtain a biological functionally equivalent protein. In making such changes, 
the substitution of amino acids whose hydropathic indices are within ±2 is preferred, those 
which are within ±1 are particularly preferred, and those within ±0.5 are even more particularly 
preferred. 
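By way of a non-limiting illustration, the ±2, ±1 and ±0.5 criteria may be applied programmatically. The Python sketch below uses the hydropathic indices recited above; the function name `candidate_substitutions`, the three-letter keys and the ordering of the results are assumptions introduced for this example.

```python
# Kyte & Doolittle (1982) hydropathic indices as recited in the text.
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def candidate_substitutions(residue, tolerance=2.0):
    """List residues whose hydropathic index lies within `tolerance` of `residue`.

    Results are ordered by increasing index difference, then alphabetically.
    A small epsilon guards against floating-point rounding at the boundary.
    """
    reference = HYDROPATHY[residue]
    hits = [
        (abs(value - reference), name)
        for name, value in HYDROPATHY.items()
        if name != residue and abs(value - reference) <= tolerance + 1e-9
    ]
    return [name for _, name in sorted(hits)]


if __name__ == "__main__":
    for tol in (2.0, 1.0, 0.5):
        print(f"Ile within ±{tol}: {candidate_substitutions('Ile', tol)}")
```

It may be noted that the index alone does not capture charge: under the ±0.5 criterion lysine and glutamate qualify as mutual candidates, so such a screen would ordinarily be combined with the other considerations discussed herein.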

It is also understood in the art that the substitution of like amino acids can be made 
effectively on the basis of hydrophilicity. U.S. Patent 4,554,101, incorporated herein by 
reference, states that the greatest local average hydrophilicity of a protein, as governed by the 
hydrophilicity of its adjacent amino acids, correlates with a biological property of the protein. 

As detailed in U.S. Patent 4,554,101, the following hydrophilicity values have been 
assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate 
(+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (-0.4); 
proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine 
(-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4). 

It is understood that an amino acid can be substituted for another having a similar 
hydrophilicity value and still obtain a biologically equivalent and immunologically equivalent 
protein. In such changes, the substitution of amino acids whose hydrophilicity values are 
within ±2 is preferred, those that are within ±1 are particularly preferred, and those within ±0.5 
are even more particularly preferred. 
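The notion of a greatest local average hydrophilicity can be illustrated with a short Python sketch that averages the values recited above over a sliding window and reports the highest-scoring segment. The window length of six residues, the one-letter keys, the function names and the peptide used in the usage note are assumptions for illustration; the central values are used and the stated ±1 ranges are ignored.

```python
# Hydrophilicity values as recited above (central values only).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def local_average_hydrophilicity(sequence, window=6):
    """Return the mean hydrophilicity of every window of `window` residues."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [HYDROPHILICITY[residue] for residue in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=6):
    """Return (start_index, segment, average) for the first highest-scoring window."""
    averages = local_average_hydrophilicity(sequence, window)
    start = max(range(len(averages)), key=averages.__getitem__)
    return start, sequence[start:start + window], averages[start]
```

For a hypothetical peptide such as AVLKRDEEGW scanned with a window of four, the segment of highest local average hydrophilicity is the charged stretch beginning at the lysine.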

As outlined above, amino acid substitutions are generally based on the relative similarity 
of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, 
charge, size, and the like. Exemplary substitutions that take various of the foregoing 
characteristics into consideration are well known to those of skill in the art and include: arginine 
and lysine; glutamate and aspartate; serine and threonine; glutamine and asparagine; and valine, 
leucine and isoleucine. 

b. apoB100 Variants 

In order to determine the optimal DNA-binding sequences, recombinant fragments of 
apoB-100 or other apolipoproteins may be used in mobility shift assays or other common 
protein-DNA interaction assays, including, but not limited to, methylation interference assays, 
DNase-I footprinting assays, UV-crosslinking assays, Biotin/Streptavidin affinity systems, or 
screening expression libraries encoding DNA-binding proteins. The recombinant 
apolipoprotein fragments are expressed by cloning the corresponding cDNA fragments into commercially 
available eukaryotic expression vectors and employing recombinant DNA expression 
techniques well known to the art. 

In addition, the recombinant fragments may be mutated by employing site-directed 
mutagenesis or oligonucleotide-directed mutagenesis techniques in order to improve their 
affinity for nucleic acids and used either in their original or mutated form. Mutations in the 
recombinant apolipoprotein fragments may include, but are not limited to, addition of 
endosomolytic and/or nuclear localization peptide sequences employing common recombinant 
DNA technology. The recombinant protein fragments are prebound to the nucleic acids of 
interest prior to their reassembly into freshly isolated lipoproteins and subsequent transfection. 
Alternatively, they are reassembled into lipoproteins prior to in vitro nucleic acid binding and 
subsequent transfection. Separation of protein-bound DNA from free DNA may be required 
prior to transfection and is accomplished by adsorption to nitrocellulose membranes or other 
common techniques including, but not limited to size-exclusion or density ultracentrifugation. 

Site-specific mutations can be made within the proposed DNA binding motifs or nuclear 
localization signal sequences of the apolipoproteins described in this invention, in order to 
improve their homology with known DNA binding motifs (e.g., SREBP-like DNA-binding 
motifs, ISGF3γ-like DNA-binding motifs) and nuclear localization signal sequences (e.g., NLS 
from human p53, Ap 1, IGFBP-S, ir, and apo J). Specific mutations in the DNA sequences of 
sterol regulatory elements (SRE) and IFN-stimulated response elements which affect the 
DNA-binding affinity of SREBP and ISGF3γ, respectively, have been described (Smith et al., 
1990; Briggs et al., 1993; Wang et al., 1993; Veals et al., 1992). 

Site-specific mutagenesis is a technique useful in the preparation of individual peptides, 
or biologically functional equivalent proteins or peptides, through specific mutagenesis of the 
underlying DNA. The technique further provides a ready ability to prepare and test sequence 
variants, incorporating one or more of the foregoing considerations, by introducing one or more 
nucleotide sequence change(s) into the DNA. Site-specific mutagenesis allows the production 
of mutants through the use of specific oligonucleotide sequences which encode the DNA 
sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to 
provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on 
both sides of the deletion junction being traversed. Typically, a primer of about 17 to 25 
nucleotides in length is preferred, with about 5 to 10 residues on both sides of the junction of 
the sequence being altered. 
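The primer geometry described above, a mutated core flanked on each side by unaltered template sequence, can be sketched in Python as follows. This is a minimal illustration for a same-length substitution only (deletions and insertions are not handled); the function names, the 0-based coordinate convention and the default flank of 10 nucleotides are assumptions for this example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def design_mutagenic_primer(template, position, new_bases, flank=10):
    """Build a sense-strand primer carrying a same-length substitution.

    template  -- DNA sequence of the strand being altered
    position  -- 0-based index of the first template base to be replaced
    new_bases -- replacement bases (replaces len(new_bases) template bases)
    flank     -- number of unaltered template bases kept on each side
    """
    template = template.upper()
    new_bases = new_bases.upper()
    end = position + len(new_bases)
    if position - flank < 0 or end + flank > len(template):
        raise ValueError("template too short for the requested flanks")
    return template[position - flank:position] + new_bases + template[end:end + flank]


if __name__ == "__main__":
    # Hypothetical template; the AAA codon at index 9 is changed to AGA.
    template = "ATGGCTAGCAAAGGTACCGATCTGGAAGCTTAA"
    primer = design_mutagenic_primer(template, 9, "AGA", flank=8)
    print(primer, len(primer))
    print(reverse_complement(primer))
```

With a three-base change and flanks of 8 nucleotides the primer is 19 nucleotides long, within the range of about 17 to 25 mentioned above. Depending on which strand serves as the single-stranded template, the reverse complement of the primer may be the oligonucleotide actually synthesized.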

In general, the technique of site-specific mutagenesis is well known in the art. As will 
be appreciated, the technique typically employs a bacteriophage vector that exists in both a 
single stranded and double stranded form. Typical vectors useful in site-directed mutagenesis 
include vectors such as the M13 phage. These phage vectors are commercially available and 
their use is generally well known to those skilled in the art. Double stranded plasmids are also 
routinely employed in site directed mutagenesis, which eliminates the step of transferring the 
gene of interest from a phage to a plasmid. 

In general, site-directed mutagenesis is performed by first obtaining a single-stranded 
vector, or by melting apart the two strands of a double-stranded vector, which includes within its sequence 
a DNA sequence encoding the desired protein. An oligonucleotide primer bearing the desired 
mutated sequence is synthetically prepared. This primer is then annealed with the 
single-stranded DNA preparation, and subjected to DNA polymerizing enzymes such as E. coli 
polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing 
strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated 
sequence and the second strand bears the desired mutation. This heteroduplex vector is then 
used to transform appropriate cells, such as E. coli cells, and clones are selected that include 
recombinant vectors bearing the mutated sequence arrangement. 

The preparation of sequence variants of the selected gene using site-directed 
mutagenesis is provided as a means of producing potentially useful species and is not meant to 
be limiting, as there are other ways in which sequence variants of genes may be obtained. For 
example, recombinant vectors encoding the desired gene may be treated with mutagenic agents, 
such as hydroxylamine, to obtain sequence variants. 

6. PURIFICATION OF LIPOPROTEINS 

The purification of plasma LDL involves obtaining a composition of Lp(a) and 
subjecting the composition to reductive cleavage in a manner that allows the formation of 
cleavage products apo(a) and apoB100. These products are then separated to yield purified 
apoB100. Plasma lipoproteins may be isolated using standard sequential flotation 
ultracentrifugation methods as described (Schumaker and Puppione, 1986). 

a. Purification of Lp(a) 

Lp(a) is known to be made in the liver of primates. The LDL and VLDL in the plasma 
represent the primary source for the purification of Lp(a). Plasma may be collected from any 
primate source for the purposes of the invention, or indeed any other source suspected of 
possessing Lp(a). The Lp(a) component of the plasma can then be separated from other 
components of the plasma using ultracentrifugal flotation at a density of 1.21 g/mL for 20 
hours at 50,000 rpm followed by affinity chromatography using lysine-Sepharose™. Of course, 
the ultracentrifugation procedure is only exemplary and those of skill in the art will be able to 
vary it according to the particular equipment and study need without undue experimentation. 
The plasma may be supplemented with various inhibitors to prevent the Lp(a) from interacting 
with LDL components of the plasma. 

Having separated Lp(a) from the other plasma components, the Lp(a) sample is purified 
using lysine-Sepharose™ affinity chromatography. This separation is 
described in detail in PCT publication WO 97/17371, specifically incorporated herein by 
reference. 

In some cases, it is desirable to use a method other than lysine-Sepharose™ 
chromatography for the purification of Lp(a); in such instances other chromatographic methods 
such as FPLC may be employed. Such methods are disclosed in Scanu et al., 1993, incorporated 
herein by reference, and may be used in conjunction with the present invention to purify 
apoB100 from Lp(a). 

The product purity can be assessed by, for example, mobility on 1% agarose gels or 
Western blots of SDS-PAGE gels utilizing anti-LDL antibodies. 



b. Isolation of ApoB100 from Lp(a) 

(i) Using Centrifugation 

Following the purification of Lp(a), the apoB100 may be separated from the apo(a) 
fraction of the Lp(a) complex using reductive cleavage. The purified intact Lp(a) protein is 
subjected to reductive cleavage wherein a known volume of Lp(a) is incubated with a reductant. 
Exemplary reductants include homocysteine, N-acetyl cysteine, 2-mercaptoethanol, 
3-mercaptopropionate, 2-aminoethanol, dithiothreitol, and DTE. 

The reaction is incubated at room temperature for 10-20 minutes. This is followed by the 
addition of an inhibitor to prevent non-covalent, lysine-mediated interactions between apo(a) and 
apoB100. ε-Aminocaproic acid (EACA) may be used as such an inhibitor, or may be substituted by other 
lysine analogues, for example, compounds such as trans-4-(aminomethyl)cyclohexanecarboxylic 
acid, N-acetyl-L-lysine, p-benzylamine sulfonic acid, hexylamine, benzamidine, benzylamine and 
L-proline. Of course, these are only exemplary lysine analogues and those of skill in the art may 
use other lysine analogues to prevent interaction between apo(a) and apoB100 proteins. The 
reaction conditions are described in greater detail in PCT publication number WO 97/17371. 
Of course, the conditions for the separation of apo(a) from the reaction mixture using sucrose 
density ultracentrifugation are only exemplary, and other methods commonly used by those of skill 
in the art may be used. 

(ii) Isolation Using Chromatographic Methods 

As an alternative to the above methods for the isolation of apoB100 from Lp(a), 
chromatographic methods may be utilized as exemplified below. 

Heparin Sepharose™ Chromatography 

Lp(a) may be treated with a reducing agent in the presence of a lysine analogue. For the 
purposes of this invention the lysine analogue is supplied to prevent the interaction of apo(a) with 
apoB100. The reducing agent is supplied to break the disulfide bond of Lp(a). Lysine analogues 
for this invention include, but are not limited to, compounds such as EACA, 
trans-4-(aminomethyl)cyclohexanecarboxylic acid, N-acetyl-L-lysine, p-benzylamine sulfonic acid, 
hexylamine, benzamidine, benzylamine and L-proline; any other lysine analogue known to the 
skilled artisan may also be used. Examples of reducing agents that may be used in this 
invention include, but are not limited to, homocysteine, N-acetyl cysteine, 2-mercaptoethanol, 
3-mercaptopropionate, 2-aminoethanol, dithiothreitol, and DTE. 

For example, the mixture of Lp(a), a reducing agent and a lysine analogue is incubated for a 
suitable period of time in a suitable buffer of pH 7.4. A heparin-Sepharose™ column is 
equilibrated with a suitable buffer containing the lysine analogue and the reducing agent. The 
mixture is applied to the equilibrated column, the column is washed with the same buffer and the 
first eluate is collected. 

The first eluate from the column contains the apo(a) dissociated from Lp(a). The "free" 
apo(a) is dialyzed against an appropriate buffer; the dialysis product is pure apo(a) that may be 
freeze dried and stored at -20°C or used immediately. The column is further washed with the 
buffer for a total of three column volumes followed by 3 volumes of 2M NaCl in the buffer. The 
high salt concentration serves to dissociate the remaining unreacted Lp(a) and LDL containing 
apoB100 free of apo(a). 

Lysine-Sepharose™ Chromatography 

An alternative to heparin-Sepharose™ chromatography is lysine chromatography. In this 
type of separation, Lp(a) is treated with a suitable reducing agent and then applied to a 
lysine-Sepharose™ column that has been equilibrated with a suitable buffer of pH 7.4 containing the 
reducing agent. The column is washed with the same buffer and the first volume of eluate is 
collected. This fraction contains LDL dissociated from apo(a). Further details of this type of 
chromatography for separating apolipoproteins may be found in PCT Publication WO 97/17371. 

7. SCREENING NUCLEIC ACIDS THAT BIND LDL 

Specifically contemplated by the present inventors are chip-based DNA technologies 
such as those described by Hacia et al. (1996) and Shoemaker et al. (1996). Chip technologies 
may be used to present DNA arrays for screening. 

In a first embodiment, chip technologies may be employed to synthesize a variety of 
DNAs in order to test for their binding to an LDL with a specific apoB100 binding region. 
Briefly, these techniques involve quantitative methods for analyzing large numbers of nucleic 
acids rapidly and accurately. By tagging genes with oligonucleotides or using fixed probe 
arrays, one can employ chip technology to segregate target molecules as high density arrays and 
screen these molecules on the basis of hybridization. See also Pease et al. (1994); Fodor et al. 
(1991). 

Thus, the invention may be applied for the screening of nucleic acids that bind to 
apoB100-containing lipoproteins. The LDL polypeptide or fragment may be free in 
solution, fixed to a support, or expressed in or on the surface of a cell, for example a bacterial cell. 
Either the LDL polypeptide or the nucleic acid may be labeled, thereby permitting determination 
of binding to the DNA molecules. 

In another embodiment, the assay may measure the inhibition of binding of LDL to a 
natural or artificial substrate or binding partner. Competitive binding assays can be performed 
in which one of the agents (LDL, binding partner or compound) is labeled. Usually, the 
polypeptide will be the labeled species. One may measure the amount of free label versus 
bound label to determine binding or inhibition of binding. 

Another technique for high throughput screening of compounds is described in WO 
84/03564. Large numbers of small test nucleic acids (test compounds) are synthesized on a 
solid substrate, such as plastic pins or some other surface. Similarly, test compounds of the 
present invention are reacted with LDL and washed. Bound polypeptide is detected by various 
methods. 

In an alternative embodiment, the invention may be applied for the screening for 
variants of apoB100-containing lipoproteins to determine a greater or lesser affinity for a 
particular type of nucleic acid. These screening methods would be similar to those described 
above, except that the LDL peptide variants will be presented as an array with the nucleic acid 
binding regions being used to probe the array. Currently, one of the most widely used 
approaches for screening polypeptide libraries is to display polypeptides on the surface of 
filamentous bacteriophage (Smith, 1991; Smith, 1992). Ladner et al. (U.S. Patent No. 
5,403,484, specifically incorporated herein by reference) reported the display of proteins on the 
outer surface of a chosen bacterial cell, spore or phage, in order to identify and characterize 
binding proteins. 

In an alternative embodiment, purified apoB100 or DNA-binding fragments thereof can 
be coated directly onto plates for use in the screening techniques. Alternatively, antibodies to 
the polypeptide can be used to immobilize the polypeptide to a solid phase. Also, fusion 
proteins containing a DNA binding region (preferably a terminal region) may be used to link 
peptides to a solid phase. Once linked, randomly sheared genomic DNA, transcripts or 
randomly generated oligomers may be contacted with the bound peptides. Any bound nucleic 
acid fragments can be identified by PCR using random primers if they are large enough. In the 
case where random oligomers are used, the oligomers, in addition to the random region, may 
comprise built-in primer binding sites that can be used to amplify an intervening random region, 
thereby identifying the region binding to apoB100. 

Thus, using the technologies described herein, it will be possible for one of skill in the 
art to screen for and isolate a variety of nucleic acids that bind to apoB100 and variants of 
apoB100 that exhibit nucleic acid binding capacity, including increased or decreased binding as 
compared to wild-type apoB100. 

8. LDL-DNA COMPLEX FORMATION 

In particular aspects of the present invention, lipoproteins are employed in order to 
transport DNA into cells in vitro and in vivo. In the present invention, optimal DNA/LDL 
binding has been established. In particular embodiments, DNA and LDL protein at a molar 
ratio of 1:1 are incubated at 37°C for 30 min in a buffered solution. An exemplary buffer may 
be 50 mM Tris-HCl at pH 7.4 containing 150 mM NaCl and 10 mM MgCl2. The 
concentrations of DNA and LDL protein may range from the pmolar range to the µmolar range. 
In a preferred embodiment, 0.39 pmole DNA is incubated with 0.39 pmole LDL protein. 
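For bench use, the equimolar amounts recited above may be converted to masses. The Python sketch below is illustrative only and rests on assumptions not found in the text: an average mass of about 650 g/mol per base pair of double-stranded DNA, an approximate apoB100 protein mass of about 513,000 g/mol with one apoB100 per LDL particle, and a hypothetical 5,000 bp plasmid.

```python
# Assumed constants (not taken from the specification).
AVG_BASE_PAIR_MASS = 650.0   # g/mol per base pair of double-stranded DNA
APOB100_MASS = 513_000.0     # g/mol, approximate apoB100 polypeptide mass


def pmol_to_ng(pmol, molar_mass):
    """Convert picomoles to nanograms for a species of the given molar mass.

    pmol x (g/mol) gives picograms; dividing by 1000 gives nanograms.
    """
    return pmol * molar_mass / 1000.0


def dsdna_ng(pmol, length_bp):
    """Mass in nanograms of `pmol` picomoles of double-stranded DNA of `length_bp`."""
    return pmol_to_ng(pmol, length_bp * AVG_BASE_PAIR_MASS)


if __name__ == "__main__":
    plasmid_bp = 5000  # hypothetical plasmid length
    print(f"DNA: {dsdna_ng(0.39, plasmid_bp):.1f} ng")
    print(f"LDL protein: {pmol_to_ng(0.39, APOB100_MASS):.1f} ng")
```

Under these assumptions 0.39 pmole of LDL protein corresponds to roughly 200 ng of apoB100, and the DNA mass scales linearly with plasmid length.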

The incubation conditions may be altered to increase or decrease the efficiency of 
DNA/LDL binding. For example, the incubation may occur at temperatures ranging from 4°C 
to 50°C; thus it is contemplated that the reaction mixture may be incubated at 4°C, 6°C, 8°C, 
10°C, 12°C, 14°C, 16°C, 18°C, 20°C, 22°C, 24°C, 26°C, 28°C, 30°C, 32°C, 34°C, 36°C, 38°C, 
40°C, 42°C, 44°C, 46°C, 48°C, or 50°C. 

The time of incubation may be varied from as little as 10 minutes to as long as 5 hours. 
Thus it is well within the skill of one in the art to incubate the mixture for varying lengths of 
time. 

Other embodiments contemplate varying the concentration of MgCl2 in the medium. 
Thus the MgCl2 concentration may vary from 1 mM to 100 mM. Thus, it is contemplated that 
the reaction mixture contains 5 mM MgCl2, 10 mM MgCl2, 12 mM MgCl2, 15 mM MgCl2, 20 mM 
MgCl2, 30 mM MgCl2, 35 mM MgCl2, 40 mM MgCl2, 50 mM MgCl2, 60 mM MgCl2, 65 mM 
MgCl2, 70 mM MgCl2, 80 mM MgCl2, 90 mM MgCl2, or 100 mM MgCl2. 

9. GENE DELIVERY AND EXPRESSION IN EUKARYOTIC CELLS 

The gene delivery system of the instant invention can be used to express any gene of 
interest in eukaryotic cells. The gene or its cDNA sequence is cloned into a plasmid containing 
the specific lipoprotein binding sequences (including, but not limited to SRE, E/C, FAS) and/or 
any eukaryotic regulatory sequence (for example, but not limited to HCMV, or tyrosine kinase 
promoter region) using DNA cloning techniques well known to the art. The orientation, 
number and location of the lipoprotein binding sequences may vary within the nucleic acid 
vector, but should not interrupt the protein coding sequence of the gene of interest. 

The gene delivery system of the instant invention (see FIG. 15) can be used to transfect 
eukaryotic cells either in vivo or in vitro with any expression vector containing one or more of 
the aforementioned lipoprotein binding sequences. Expression vectors are designed using 
recombinant DNA cloning techniques known to the art and generally include five components 
linked in the following 5' to 3' orientation: 1) a eukaryotic promoter sequence, 2) a sequence 
encoding a 5' untranslated RNA (UTR) which may include a first intron sequence followed by a 
consensus Kozak sequence and an initiation ATG, 3) a protein coding sequence, 4) a 3' UTR, 
and 5) a cognate transcription terminator sequence. 
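The 5' to 3' ordering of the five components can be expressed as a simple data structure. The following Python sketch is a hypothetical illustration: the class name `ExpressionCassette`, its field names and the validation rules are assumptions for this example, and the initiation ATG is treated here as the first codon of the coding sequence for convenience. Lipoprotein binding sequences are not modeled.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class ExpressionCassette:
    """Five vector components, held in 5' to 3' order."""
    promoter: str
    five_prime_utr: str    # may include an intron and a Kozak context
    coding_sequence: str   # begins with ATG and ends with a stop codon
    three_prime_utr: str
    terminator: str

    def validate(self):
        """Check basic reading-frame properties of the coding sequence."""
        cds = self.coding_sequence.upper()
        if not cds.startswith("ATG"):
            raise ValueError("coding sequence must begin with ATG")
        if len(cds) % 3 != 0:
            raise ValueError("coding sequence length must be a multiple of 3")
        if cds[-3:] not in STOP_CODONS:
            raise ValueError("coding sequence must end with a stop codon")

    def assemble(self):
        """Return the concatenated cassette in 5' to 3' orientation."""
        self.validate()
        parts = (
            self.promoter,
            self.five_prime_utr,
            self.coding_sequence,
            self.three_prime_utr,
            self.terminator,
        )
        return "".join(part.upper() for part in parts)
```

In use, short placeholder strings may be supplied for each field to confirm the ordering before real sequences are substituted.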

Lipoproteins are isolated from blood in a manner similar to the previously described 
procedures (see Example 1) and bound to the nucleic acids of interest in a manner similar to the 
previously described DNA binding protocol (see Example 2). Separation of protein-bound 
DNA from free DNA may be required prior to transfection and can be accomplished by 
adsorption to nitrocellulose membranes or other techniques well known to the art including, but 
not limited to size-exclusion or density ultracentrifugation. 



a) Control Regions 

In order for the gene delivery system of the present invention to effect expression of a 
transcript encoding a selected gene, the polynucleotides encoding these genes will be under the 
transcriptional control of a promoter. A "promoter" refers to a DNA sequence recognized by 
the synthetic machinery of the host cell, or introduced synthetic machinery, that is required to 
initiate the specific transcription of a gene. The phrase "under transcriptional control" means 
that the promoter is in the correct location in relation to the polynucleotide to control RNA 
polymerase initiation and expression of the polynucleotide. 

The term promoter will be used here to refer to a group of transcriptional control 
modules that are clustered around the initiation site for RNA polymerase II. Much of the 
thinking about how promoters are organized derives from analyses of several viral promoters, 
including those for the HSV thymidine kinase (tk) and SV40 early transcription units. These 
studies, augmented by more recent work, have shown that promoters are composed of discrete 
functional modules, each consisting of approximately 7-20 bp of DNA, and containing one or 
more recognition sites for transcriptional activator or repressor proteins. 

At least one module in each promoter functions to position the start site for RNA 
synthesis. The best known example of this is the TATA box, but in some promoters lacking a 
TATA box, such as the promoter for the mammalian terminal deoxynucleotidyl transferase gene 
and the promoter for the SV40 late genes, a discrete element overlying the start site itself helps 
to fix the place of initiation. 

Additional promoter elements regulate the frequency of transcriptional initiation. 
Typically, these are located in the region 30-110 bp upstream of the start site, although a 
number of promoters have recently been shown to contain functional elements downstream of 
the start site as well. The spacing between promoter elements frequently is flexible, so that 
promoter function is preserved when elements are inverted or moved relative to one another. In 
the tk promoter, the spacing between promoter elements can be increased to 50 bp apart before 
activity begins to decline. Depending on the promoter, it appears that individual elements can 
function either cooperatively or independently to activate transcription. 

The particular promoter that is employed to control the expression of a therapeutic gene 
is not believed to be critical, so long as it is capable of expressing the polynucleotide in the 
targeted cell. Thus, where a human cell is targeted, it is preferable to position the 
polynucleotide coding region adjacent to and under the control of a promoter that is capable of
being expressed in a human cell. Generally speaking, such a promoter might include either a
human or viral promoter.

In preferred embodiments, the human cytomegalovirus (CMV) immediate early gene 
promoter, the SV40 early promoter and the Rous sarcoma virus long terminal repeat can be 
used to obtain high-level expression of the polynucleotide of interest. The use of other viral or 
mammalian cellular or bacterial phage promoters which are well-known in the art to achieve 
expression of polynucleotides is contemplated as well, provided that the levels of expression are 
sufficient to produce a growth inhibitory effect.

By employing a promoter with well-known properties, the level and pattern of
expression of a polynucleotide following transfection can be optimized. For example, selection
of a promoter which is active in specific cells, such as tyrosinase (melanoma), alpha-fetoprotein
and albumin (liver tumors), CC10 (lung tumor) and prostate-specific antigen (prostate tumor)
will permit tissue-specific expression of the therapeutic gene.

Enhancers were originally detected as genetic elements that increased transcription from 
a promoter located at a distant position on the same molecule of DNA. This ability to act over a 
large distance had little precedent in classic studies of prokaryotic transcriptional regulation. 
Subsequent work showed that regions of DNA with enhancer activity are organized much like 
promoters. That is, they are composed of many individual elements, each of which binds to one
or more transcriptional proteins. 

The basic distinction between enhancers and promoters is operational. An enhancer 
region as a whole must be able to stimulate transcription at a distance; this need not be true of a 
promoter region or its component elements. On the other hand, a promoter must have one or 
more elements that direct initiation of RNA synthesis at a particular site and in a particular 
orientation, whereas enhancers lack these specificities. Promoters and enhancers are often 
overlapping and contiguous, often seeming to have a very similar modular organization.

Additionally, any promoter/enhancer combination (as per the Eukaryotic Promoter Data
Base, EPDB) could be used to drive expression of a particular construct. Use of a T3, T7 or SP6
cytoplasmic expression system is another possible embodiment. Eukaryotic cells can support
cytoplasmic transcription from certain bacteriophage promoters if the appropriate bacteriophage
polymerase is provided, either as part of the delivery complex or as an additional genetic
expression vector.

According to the present invention, a number of different promoters are required. It is 
contemplated that these promoters may be the same or different, but the selection of particular 
promoters for particular uses may be advantageous.

b) IRES 

In certain embodiments of the invention, the use of internal ribosome entry site
(IRES) elements may prove advantageous. These elements are used to create multigene, or
polycistronic, messages. IRES elements are able to bypass the ribosome scanning model of
5' methylated cap-dependent translation and begin translation at internal sites (Pelletier and
Sonenberg, 1988). IRES elements from two members of the picornavirus family (polio and
encephalomyocarditis) have been described (Pelletier and Sonenberg, 1988), as well as an IRES
from a mammalian message (Macejak and Sarnow, 1991). IRES elements can be linked to
heterologous open reading frames. Multiple open reading frames can be transcribed together,
each separated by an IRES, creating polycistronic messages. By virtue of the IRES element,
each open reading frame is accessible to ribosomes for efficient translation. Multiple genes can
be efficiently expressed using a single promoter/enhancer to transcribe a single message.

Any heterologous open reading frame can be linked to IRES elements. This includes
genes for secreted proteins, multi-subunit proteins encoded by independent genes, intracellular
or membrane-bound proteins and selectable markers. In this way, expression of several proteins
can be simultaneously engineered into a cell with a single construct and a single selectable
marker.




In addition, it may be desirable to include polyadenylation signals in the vectors. These
signals serve to terminate transcription and to stabilize mRNA transcripts produced from the
vectors. A preferred polyadenylation signal is an SV40 polyadenylation signal.
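
The element order described above (one promoter/enhancer, open reading frames separated by IRES elements, and a terminal polyadenylation signal) can be sketched as a simple ordered list. This is an illustrative sketch only: the function name `polycistronic_cassette` and the string labels are hypothetical and stand in for actual sequences, which the text does not specify.

```python
def polycistronic_cassette(promoter, orfs, polya="SV40 polyA"):
    """Return the 5'->3' order of elements in a single polycistronic message.

    promoter: label for the single promoter/enhancer driving transcription.
    orfs:     labels for the open reading frames, in the desired order.
    polya:    label for the polyadenylation signal (SV40 is the preferred
              signal named in the text; the label itself is illustrative).
    """
    if not orfs:
        raise ValueError("at least one open reading frame is required")
    elements = [promoter]
    for index, orf in enumerate(orfs):
        if index > 0:
            # Every ORF after the first is preceded by an IRES so that
            # ribosomes can initiate translation internally.
            elements.append("IRES")
        elements.append(orf)
    elements.append(polya)
    return elements


if __name__ == "__main__":
    # A therapeutic gene plus a selectable marker on one message.
    print(" - ".join(polycistronic_cassette("CMV IE promoter", ["p53", "neo"])))
```

Only the first open reading frame relies on cap-dependent initiation in this layout; each later one is preceded by its own IRES.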



c) Genes 

The present invention contemplates the use of a variety of different genes inserted into 
the SV40 vector. For example, genes encoding enzymes, hormones, cytokines, oncogenes, 
receptors, tumor suppressors, transcription factors, drug selectable markers, toxins and various 
antigens are contemplated as suitable genes for use according to the present invention. In 
addition, antisense constructs derived from oncogenes are other "genes" of interest according to
the present invention. 

A common gene currently being used in many gene therapy trials is p53, which 
currently is recognized as a tumor suppressor gene. High levels of mutant p53 have been found 
in many cells transformed by chemical carcinogenesis, ultraviolet radiation, and several viruses. 
The p53 gene is a frequent target of mutational inactivation in a wide variety of human tumors 
and is already documented to be the most frequently-mutated gene in common human cancers. 
It is mutated in over 50% of human NSCLC (Hollstein et al., 1991) and in a wide spectrum of
other tumors. Overexpression of wild-type p53 has been shown in some cases to be
anti-proliferative in human tumor cell lines. Thus, p53 can act as a negative regulator of cell growth
(Weinberg, 1991) and may directly suppress uncontrolled cell growth or indirectly activate
genes that suppress this growth. It has also been reported that transfection of DNA encoding
wild-type p53 into cancer cell lines restores growth suppression control in such cells (Casey et
al., 1991; Takahasi et al., 1992). It is thus proposed that the treatment of p53-associated
cancers with wild type p53 in the compositions of the present invention will reduce the number 
of malignant cells or their growth rate. 

p16INK4 belongs to a newly described class of CDK-inhibitory proteins that also includes
p16B, p21WAF1, and p27KIP1. The p16INK4 gene maps to 9p21, a chromosome region frequently
deleted in many tumor types. Homozygous deletions and mutations of the p16INK4 gene are
frequent in human tumor cell lines. This evidence suggests that the p16INK4 gene is a tumor
suppressor gene. This interpretation has been challenged, however, by the observation that the
frequency of the p16INK4 gene alterations is much lower in primary uncultured tumors than in
cultured cell lines (Caldas et al., 1994; Cheng et al., 1994; Hussussian et al., 1994; Kamb et al.,
1994; Kamb et al., 1994; Mori et al., 1994; Okamoto et al., 1994; Nobori et al., 1995; Orlow et
al., 1994; Arap et al., 1995). Restoration of wild-type p16INK4 function by transfection with a
plasmid expression vector reduced colony formation by some human cancer cell lines
(Okamoto, 1994; Arap, 1995).

Cell adhesion molecules, or CAMs, are known to be involved in a complex network of
molecular interactions that regulate organ development and cell differentiation (Edelman,
1985). Recent data indicate that aberrant expression of CAMs may be involved in the
tumorigenesis of several neoplasms; for example, decreased expression of E-cadherin, which is
predominantly expressed in epithelial cells, is associated with the progression of several kinds
of neoplasms (Edelman and Crossin, 1991; Frixen et al., 1991; Bussemakers et al., 1992;
Matsura et al., 1992; Umbas et al., 1992). Also, Giancotti and Ruoslahti (1990) demonstrated
that increasing expression of α5β1 integrin by gene transfer can reduce tumorigenicity of
Chinese hamster ovary cells in vivo. C-CAM now has been shown to suppress tumor growth
in vitro and in vivo. Thus, the compositions of the present invention can be employed to
mediate C-CAM suppression of tumor cell growth.

Other tumor suppressors that may be employed according to the present invention
include RB, APC, DCC, NF-1, NF-2, WT-1, MEN-I, MEN-II, zac1, p73, VHL, MMAC1, FCC
and MCC. Inducers of apoptosis, such as Bax, Bak, Bcl-Xs, Bik, Bid, Harakiri, Ad E1B, Bad
and ICE-CED3 proteases, similarly could find use according to the present invention.

Various enzyme genes are of interest according to the present invention. Such enzymes
include cytosine deaminase, hypoxanthine-guanine phosphoribosyltransferase,
galactose-1-phosphate uridyltransferase, phenylalanine hydroxylase, glucocerebrosidase, sphingomyelinase,
α-L-iduronidase, glucose-6-phosphate dehydrogenase, HSV thymidine kinase and human
thymidine kinase.

In another example, the expression vector may include a nucleotide sequence encoding
functional apolipoprotein A-I for the prevention or treatment of atherosclerosis.
Atherosclerosis is a disease that is characterized by the development of atherosclerotic lesions
which contain cholesterol esters and other lipids that are derived from the blood circulation.
The plasma concentration of HDL is inversely correlated with the risk for development of
atherosclerosis. HDL present in the blood circulation takes up free cholesterol from extrahepatic
cells, which, through the action of LCAT (lecithin-cholesterol acyltransferase), is converted to
cholesterol esters and stored in the core of the HDL particles. The HDL cholesterol esters are
transported either directly or indirectly via transfer to triglyceride rich lipoproteins (i.e., VLDL,
IDL, LDL) to the liver by a process called "reverse cholesterol transport". Reverse cholesterol
transport is of great importance for maintaining cholesterol homeostasis since the liver is the
major organ for cholesterol excretion from the body via bile acids. Apo A-I is the major protein
constituent of HDL and a cofactor for LCAT. Therefore, increasing the plasma concentration of
apo A-I containing HDL can increase the reverse cholesterol transport and reduce the risk for
atherosclerosis.

Hormones are another group of genes that may be used in the SV40 vectors described
herein. Included are growth hormone, prolactin, placental lactogen, luteinizing hormone,
follicle-stimulating hormone, chorionic gonadotropin, thyroid-stimulating hormone, leptin,
adrenocorticotropin (ACTH), angiotensin I and II, β-endorphin, β-melanocyte stimulating
hormone (β-MSH), cholecystokinin, endothelin I, galanin, gastric inhibitory peptide (GIP),
glucagon, insulin, lipotropins, neurophysins, somatostatin, calcitonin, calcitonin gene related
peptide (CGRP), β-calcitonin gene related peptide, hypercalcemia of malignancy factor (1-40),
parathyroid hormone-related protein (107-139) (PTH-rP), parathyroid hormone-related protein
(107-111) (PTH-rP), glucagon-like peptide (GLP-1), pancreastatin, pancreatic peptide, peptide
YY, PHM, secretin, vasoactive intestinal peptide (VIP), oxytocin, vasopressin (AVP),
vasotocin, enkephalinamide, metorphinamide, alpha melanocyte stimulating hormone
(alpha-MSH), atrial natriuretic factor (5-28) (ANF), amylin, amyloid P component (SAP-1),
corticotropin releasing hormone (CRH), growth hormone releasing factor (GHRH), luteinizing
hormone-releasing hormone (LHRH), neuropeptide Y, substance K (neurokinin A), substance
P and thyrotropin releasing hormone (TRH).

Other classes of genes that are contemplated to be inserted into the SV40 vectors of the
present invention include interleukins and cytokines: interleukin 1 (IL-1), IL-2, IL-3, IL-4,
IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, GM-CSF and G-CSF.

Other therapeutic genes might include genes encoding antigens such as viral antigens,
bacterial antigens, fungal antigens or parasitic antigens. Viruses include picornavirus,
coronavirus, togavirus, flavivirus, rhabdovirus, paramyxovirus, orthomyxovirus, bunyavirus,
arenavirus, reovirus, retrovirus, papovavirus, parvovirus, herpesvirus, poxvirus, hepadnavirus,
and spongiform virus. Preferred viral targets include influenza, herpes simplex virus 1 and 2,
measles, smallpox, polio or HIV. Pathogens include trypanosomes, tapeworms, roundworms
and helminths. Also, tumor markers, such as fetal antigen or prostate specific antigen, may be
targeted in this manner. Preferred examples include HIV env proteins and hepatitis B surface
antigen. Administration of a vector according to the present invention for vaccination purposes
would require that the vector-associated antigens be sufficiently non-immunogenic to enable
long term expression of the transgene, for which a strong immune response would be desired.
Preferably, vaccination of an individual would only be required infrequently, such as yearly or
biennially, and provide long term immunologic protection against the infectious agent.

In yet another embodiment, the heterologous gene may include a single-chain antibody. 
Methods for the production of single-chain antibodies are well known to those of skill in the art. 
The skilled artisan is referred to U.S. Patent No. 5,359,046 (incorporated herein by reference)
for such methods. A single chain antibody is created by fusing together the variable domains of 
the heavy and light chains using a short peptide linker, thereby reconstituting an antigen binding 
site on a single molecule. 

Single-chain antibody variable fragments (Fvs) in which the C-terminus of one variable
domain is tethered to the N-terminus of the other via a 15 to 25 amino acid peptide or linker,
have been developed without significantly disrupting antigen binding or specificity of the
binding (Bedzyk et al., 1990; Chaudhary et al., 1990). These Fvs lack the constant regions (Fc)
present in the heavy and light chains of the native antibody.

Antibodies to a wide variety of molecules are contemplated, such as oncogenes, toxins,
hormones, enzymes, viral or bacterial antigens, transcription factors or receptors.

d. Antisense 

The instant invention can be used to transfect eukaryotic cells with ribonucleotide 
sequences including anti-sense RNA and ribozymes, that function to inhibit the translation of 
any mRNA of interest, either by direct binding (to the mRNA of interest), or blocking 
deoxyribonucleic acid (DNA) coding sequences preventing transcription. 

Anti-sense RNA inhibits the translation of mRNA by direct binding to the mRNA of
interest and preventing protein translation, either by inhibition of ribosome binding or the 
translocation of the targeted mRNA molecule which then becomes more susceptible to nuclease 
degradation. 

Antisense methodology takes advantage of the fact that nucleic acids tend to pair with
"complementary" sequences. By complementary, it is meant that polynucleotides are those
which are capable of base-pairing according to the standard Watson-Crick complementarity
rules. That is, the larger purines will base pair with the smaller pyrimidines to form
combinations of guanine paired with cytosine (G:C) and adenine paired with either thymine
(A:T) in the case of DNA, or adenine paired with uracil (A:U) in the case of RNA. Inclusion of
less common bases such as inosine, 5-methylcytosine, 6-methyladenine, hypoxanthine and
others in hybridizing sequences does not interfere with pairing. Oncogenes such as ras, myc,
neu, raf, erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl also are suitable targets for antisense
constructs.
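
The pairing rules stated above (G:C, A:T in DNA, A:U in RNA) are enough to derive the antisense strand for a given target. The following is a minimal sketch, assuming the target is written 5' to 3' using only the four standard bases; the function name `antisense_rna` is illustrative and is not taken from the specification.

```python
# Watson-Crick partners for an RNA antisense strand. A DNA target base "T"
# is treated like "U", since both pair with adenine.
RNA_PARTNER = {"A": "U", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_rna(target):
    """Return the RNA strand, written 5'->3', that base-pairs with `target`.

    The two strands of a duplex are antiparallel, so the partner bases are
    read off the target in reverse order.
    """
    target = target.upper()
    unknown = set(target) - set(RNA_PARTNER)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return "".join(RNA_PARTNER[base] for base in reversed(target))


if __name__ == "__main__":
    print(antisense_rna("AUGGCC"))  # strand that pairs with 5'-AUGGCC-3'
```

The less common bases mentioned in the text (inosine, 5-methylcytosine and so on) are deliberately left out of this sketch.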

Targeting double-stranded (ds) DNA with polynucleotides leads to triple-helix 
formation; targeting RNA will lead to double-helix formation. Antisense polynucleotides, 
when introduced into a target cell, specifically bind to their target polynucleotide and interfere 
with transcription, RNA processing, transport, translation and/or stability. Antisense RNA 
constructs, or DNA encoding such antisense RNAs, may be employed to inhibit gene
transcription or translation or both within a host cell, either in vitro or in vivo, such as within a
host animal, including a human subject.

Antisense constructs may be designed to bind to the promoter and other control regions, 
exons, introns or even exon-intron boundaries of a gene. It is contemplated that the most
effective antisense constructs will include regions complementary to intron/exon splice 
junctions. Thus, it is proposed that a preferred embodiment includes an antisense construct with 
complementarity to regions within 50-200 bases of an intron-exon splice junction. It has been 
observed that some exon sequences can be included in the construct without seriously affecting 
the target selectivity thereof. The amount of exonic material included will vary depending on 
the particular exon and intron sequences used. One can readily test whether too much exon 
DNA is included simply by testing the constructs in vitro to determine whether normal cellular 
function is affected or whether the expression of related genes having complementary sequences
is affected. 
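
As a rough illustration of the preferred targeting region, the sketch below computes the stretch of a transcript lying within a chosen distance of a splice junction. It assumes one reading of "within 50-200 bases": a window extending up to `flank` bases on either side of the junction, with `flank` chosen between 50 and 200. The function name and the 0-based coordinate convention are assumptions made for illustration.

```python
def junction_window(junction_pos, transcript_length, flank=50):
    """Return (start, end) of the region within `flank` bases of a junction.

    junction_pos:      0-based index of the first base after the junction.
    transcript_length: total length of the pre-mRNA or gene sequence.
    flank:             distance from the junction, expected in 50..200.
    The result is a half-open interval clipped to the transcript ends.
    """
    if not 50 <= flank <= 200:
        raise ValueError("flank is expected to lie between 50 and 200 bases")
    if not 0 <= junction_pos <= transcript_length:
        raise ValueError("junction_pos must lie within the transcript")
    start = max(0, junction_pos - flank)
    end = min(transcript_length, junction_pos + flank)
    return start, end


if __name__ == "__main__":
    # Junction at position 300 of a 1000-base transcript, 50-base flank.
    print(junction_window(300, 1000, flank=50))
```

Candidate antisense sequences would then be drawn from inside the returned interval and tested in vitro as the text describes.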

As stated above, "complementary" or "antisense" means polynucleotide sequences that 
are substantially complementary over their entire length and have very few base mismatches. 
For example, sequences of fifteen bases in length may be termed complementary when they 
have complementary nucleotides at thirteen or fourteen positions. Naturally, sequences which 
are completely complementary will be sequences which are entirely complementary throughout
their entire length and have no base mismatches. Other sequences with lower degrees of 
homology also are contemplated. For example, an antisense construct which has limited 
regions of high homology, but also contains a non-homologous region (e.g., ribozyme) could be
designed. These molecules, though having less than 50% homology, would bind to target 
sequences under appropriate conditions. 
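
The fifteen-base example above can be made concrete by counting paired positions between an antisense sequence and its target. This is a minimal sketch under stated assumptions: both sequences are written 5' to 3', they have equal length, only standard Watson-Crick pairs count as matches, and the default of two permitted mismatches simply mirrors the "thirteen or fourteen of fifteen" example. The function names are illustrative.

```python
# Standard Watson-Crick pairs (antisense base, target base); wobble pairs
# such as G:U are counted as mismatches in this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def complementary_positions(antisense, target):
    """Count positions at which `antisense` pairs with `target`.

    Both strands are given 5'->3'; because a duplex is antiparallel, the
    antisense strand is compared against the reversed target.
    """
    if len(antisense) != len(target):
        raise ValueError("sequences must have the same length")
    paired = zip(antisense.upper(), reversed(target.upper()))
    return sum(pair in PAIRS for pair in paired)


def is_substantially_complementary(antisense, target, max_mismatches=2):
    """True if no more than `max_mismatches` positions fail to pair."""
    mismatches = len(target) - complementary_positions(antisense, target)
    return mismatches <= max_mismatches


if __name__ == "__main__":
    target = "AUGGCCAUUGUAAUG"
    candidate = "GGUUACAAUGGCCAU"  # perfect antisense with two bases altered
    print(complementary_positions(candidate, target),
          is_substantially_complementary(candidate, target))
```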

It may be advantageous to combine portions of genomic DNA with cDNA or synthetic 
sequences to generate specific constructs. For example, where an intron is desired in the 
ultimate construct, a genomic clone will need to be used. The cDNA or a synthesized 
polynucleotide may provide more convenient restriction sites for the remaining portion of the 
construct and, therefore, would be used for the rest of the sequence. 




e. Ribozymes 

Ribozymes are RNA molecules that catalyze the specific cleavage of RNA. Ribozyme 
activity is mediated through the hybridization of the ribozyme molecule to a specific sequence 
in the target RNA, followed by the endonucleolytic cleavage of the target RNA within that 
sequence. Potential RNA cleavage sites can be identified by searching for specific 
ribonucleotide sequences that include sequences such as GUU, GUC, and GUA within the
target RNA. Hammerhead motif ribozyme molecules can then be designed that contain short 
RNA sequences (15-25 ribonucleotides) that are complementary to the region including the 
cleavage site of the target RNA. 
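
The site search described above can be sketched as a scan of the target RNA for the three triplets named in the text, followed by extraction of a short region that includes each site. The helper names, the default window length of 20 (within the stated 15-25 range), and the choice to centre the window on the triplet are assumptions made for illustration; designing an actual hammerhead ribozyme involves further structural considerations not covered here.

```python
CLEAVAGE_TRIPLETS = ("GUU", "GUC", "GUA")


def find_cleavage_sites(rna):
    """Return (index, triplet) for every GUU, GUC or GUA in `rna`.

    Indices are 0-based positions of the G; overlapping sites are reported.
    """
    rna = rna.upper()
    return [(i, rna[i:i + 3]) for i in range(len(rna) - 2)
            if rna[i:i + 3] in CLEAVAGE_TRIPLETS]


def target_region(rna, site_index, length=20):
    """Return up to `length` bases of `rna` that include the triplet at `site_index`.

    The window is roughly centred on the triplet and clipped at the ends of
    the sequence, so it can be shorter than `length` near either end.
    """
    if not 15 <= length <= 25:
        raise ValueError("length is expected to lie between 15 and 25")
    start = max(0, site_index - (length - 3) // 2)
    end = min(len(rna), start + length)
    return rna[start:end]


if __name__ == "__main__":
    rna = "AAAAAAAAAAGUCAAAAAAAAAA"
    for index, triplet in find_cleavage_sites(rna):
        print(index, triplet, target_region(rna, index, length=15))
```

A sequence complementary to each returned region would then be a candidate for the ribozyme's target-binding portion.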

Ribozymes are RNA-protein complexes that cleave nucleic acids in a site-specific
fashion. Ribozymes have specific catalytic domains that possess endonuclease activity (Kim
and Cech, 1987; Gerlach et al., 1987; Forster and Symons, 1987). For example, a large number
of ribozymes accelerate phosphoester transfer reactions with a high degree of specificity, often
cleaving only one of several phosphoesters in an oligonucleotide substrate (Cech et al., 1981;
Michel and Westhof, 1990; Reinhold-Hurek and Shub, 1992). This specificity has been
attributed to the requirement that the substrate bind via specific base-pairing interactions to the
internal guide sequence ("IGS") of the ribozyme prior to chemical reaction.

Ribozyme catalysis has primarily been observed as part of sequence-specific
cleavage/ligation reactions involving nucleic acids (Joyce, 1989; Cech et al., 1981). For
example, U.S. Patent No. 5,354,855 reports that certain ribozymes can act as endonucleases
with a sequence specificity greater than that of known ribonucleases and approaching that of the
DNA restriction enzymes. Thus, sequence-specific ribozyme-mediated inhibition of gene
expression may be particularly suited to therapeutic applications (Scanlon et al., 1991; Sarver et
al., 1990). Recently, it was reported that ribozymes elicited genetic changes in some cell lines
to which they were applied; the altered genes included the oncogenes H-ras, c-fos and genes of
HIV. Most of this work involved the modification of a target mRNA, based on a specific
mutant codon that is cleaved by a specific ribozyme.

Since the secondary structure of both target RNA as well as the anti-sense RNA is of
great importance for the hybridization of both molecules, the predicted structural features can
be analyzed and RNase protection assays can be used to determine hybridization efficiency.
Anti-sense RNA and ribozymes can be synthesized employing chemical nucleic acid synthesis
techniques well known to the art (i.e., solid phase phosphoramidite synthesis) or the RNA
molecules can be produced by in vitro and in vivo transcription of DNA sequences encoding the
antisense RNA. DNA sequences encoding ribozymes or anti-sense RNA may be incorporated
into an expression vector. The expression vector may be prebound to purified plasma
lipoprotein fractions prior to transfection into eukaryotic cells.

f. Self-initiating and self-sustaining gene expression systems 
The invention gene delivery system can also be used to deliver self-initiating and
self-sustaining gene expression systems. Self-initiating and self-sustaining gene expression systems
may be constructed by binding an RNA polymerase to a DNA construct in vitro prior to the
introduction of the polynucleotide into the cell, as described by Wagner et al. (U.S. Patent No.
5,591,601). The RNA polymerase is bound to a DNA construct containing a cognate promoter
of the RNA polymerase operably linked to a DNA sequence encoding the RNA polymerase.

The expression of functional RNA polymerase in turn enables the expression of any 
gene of interest that contains a cognate promoter sequence recognized by the same RNA 
polymerase in eukaryotic host cells. DNA sequences encoding for both RNA polymerase and 
gene product of interest (i.e., protein of interest) may be contained within the same gene
expression system. The gene expression system may be prebound to purified plasma 
lipoprotein fractions prior to transfection into eukaryotic cells. 

g. Delivery of DNA to cells in vivo

The invention gene delivery system can also be used to deliver DNA to cells in vivo. An 
expression vector containing the polynucleotide sequences of the gene of interest (e.g., a reporter
gene or a healthy copy of a defective gene) is prebound to LDL according to the protocols
described herein. This DNA-LDL complex is then introduced into an organism, for example a
rat, mouse or human, by, for example, intravenous injection. At varying times post-injection,
LDL is isolated from the blood and probed for DNA sequences of the type that were prebound
to the LDL using standard molecular biological techniques such as, but not limited to, Southern
blot hybridization or PCR™.

The LDL also can be immunoprecipitated with anti-LDL antibodies and then probed for 
specific DNA sequences bound to it. In order to determine cellular internalization and/or 
integration of the reporter gene sequences into the genomic DNA of cells of different tissues, 
total genomic DNA can be isolated from various tissues (according to standard molecular 
biology techniques) and probed for the presence of the reporter gene sequences using specific 
polynucleotide probes in PCR™ or Southern blot hybridization techniques. In addition, total
cellular RNA can be isolated from various different tissues using standard molecular biology
techniques and probed for the presence of specific mRNA encoded for by the reporter gene 
polynucleotide sequences using specific antisense polynucleotide probes in Northern blot 
hybridization techniques or ribonuclease (RNase) protection assays. 

Expression of a functional protein encoded by the gene of interest in different tissues
can be analyzed using techniques well known to the art, such as Western blot analysis of
cellular protein extracts with antibodies that bind specifically to the reporter gene product (i.e.,
protein of interest) or direct detection of intracellular fluorescence (e.g., when reporter genes are
used that encode blue or green fluorescent proteins, such as GFP from Clontech Inc.).

Several non-viral methods for the transfer of a DNA-LDL complex of the present
invention into cultured mammalian cells also are contemplated by the present invention. These
include calcium phosphate precipitation (Graham and Van Der Eb, 1973; Chen and Okayama,
1987; Rippe et al., 1990), DEAE-dextran (Gopal, 1985), electroporation (Tur-Kaspa et al., 1986;
Potter et al., 1984), direct microinjection (Harland and Weintraub, 1985), DNA-loaded
liposomes (Nicolau and Sene, 1982; Fraley et al., 1979) and lipofectamine-DNA complexes,
cell sonication (Fechheimer et al., 1987), gene bombardment using high velocity
microprojectiles (Yang et al., 1990), and receptor-mediated transfection (Wu and Wu, 1987;
Wu and Wu, 1988). Some of these techniques may be successfully adapted for in vivo or ex
vivo use.

Once the DNA-LDL complex has been delivered into the cell, the nucleic acid encoding
the gene of interest may be positioned and expressed at different sites. In certain embodiments,
the nucleic acid encoding the gene may be stably integrated into the genome of the cell. This
integration may be in the cognate location and orientation via homologous recombination (gene
replacement) or it may be integrated in a random, non-specific location (gene augmentation). In
yet further embodiments, the nucleic acid may be stably maintained in the cell as a separate,
episomal segment of DNA. Such nucleic acid segments or "episomes" encode sequences
sufficient to permit maintenance and replication independent of or in synchronization with the
host cell cycle. How the DNA-LDL complex is delivered to a cell and where in the cell the
nucleic acid remains is dependent on the type of DNA molecule bound to the LDL.

In one embodiment of the invention, the DNA-LDL complex may simply consist of 
naked recombinant DNA or plasmids. Transfer of the construct may be performed by any of 
the methods mentioned above which physically or chemically permeabilize the cell membrane. 
This is particularly applicable for transfer in vitro but it may be applied to in vivo use as well. 
Dubensky et al. (1984) successfully injected polyomavirus DNA in the form of calcium
phosphate precipitates into liver and spleen of adult and newborn mice demonstrating active
viral replication and acute infection. Benvenisty and Reshef (1986) also demonstrated that
direct intraperitoneal injection of calcium phosphate-precipitated plasmids results in expression 
of the transfected genes. It is envisioned that DNA encoding a gene of interest may also be 
transferred in a similar manner in vivo and express the gene product. 

Another embodiment of the invention for transferring a naked DNA-LDL complex into 
cells may involve particle bombardment. This method depends on the ability to accelerate 
DNA-coated microprojectiles to a high velocity allowing them to pierce cell membranes and 
enter cells without killing them (Klein et al., 1987). Several devices for accelerating small
particles have been developed. One such device relies on a high voltage discharge to generate
an electrical current, which in turn provides the motive force (Yang et al., 1990). The
microprojectiles used have consisted of biologically inert substances such as tungsten or gold
beads.

Selected organs including the liver, skin, and muscle tissue of rats and mice have been
bombarded in vivo (Yang et al., 1990; Zelenin et al., 1991). This may require surgical exposure
of the tissue or cells, to eliminate any intervening tissue between the gun and the target organ,
i.e., ex vivo treatment. Again, DNA encoding a particular gene may be delivered via this
method and still be incorporated by the present invention.

In a further embodiment of the invention, the DNA-LDL complex may be entrapped in a 
liposome. Liposomes are vesicular structures characterized by a phospholipid bilayer 
membrane and an inner aqueous medium. Multilamellar liposomes have multiple lipid layers 
separated by aqueous medium. They form spontaneously when phospholipids are suspended in 
an excess of aqueous solution. The lipid components undergo self-rearrangement before the 
formation of closed structures and entrap water and dissolved solutes between the lipid bilayers 
(Ghosh and Bachhawat, 1991). Also contemplated are lipofectamine-DNA complexes. 

Liposome-mediated nucleic acid delivery and expression of foreign DNA in vitro has 
been very successful. Wong et al. (1980) demonstrated the feasibility of liposome-mediated
delivery and expression of foreign DNA in cultured chick embryo, HeLa and hepatoma cells.
Nicolau et al. (1987) accomplished successful liposome-mediated gene transfer in rats after
intravenous injection. 

In certain embodiments of the invention, the liposome may be complexed with a
hemagglutinating virus (HVJ). This has been shown to facilitate fusion with the cell membrane
and promote cell entry of liposome-encapsulated DNA (Kaneda et al., 1989). In other
embodiments, the liposome may be complexed or employed in conjunction with nuclear
non-histone chromosomal proteins (HMG-1) (Kato et al., 1991). In yet further embodiments, the
liposome may be complexed or employed in conjunction with both HVJ and HMG-1. In that
such expression constructs have been successfully employed in transfer and expression of
nucleic acid in vitro and in vivo, they are applicable for the present invention. Where a
bacterial promoter is employed in the DNA construct, it also will be desirable to include within
the liposome an appropriate bacterial polymerase.

Other DNA-LDL complexes which can be employed to deliver a nucleic acid encoding
a particular gene into cells are receptor-mediated delivery vehicles. These take advantage of the
selective uptake of macromolecules by receptor-mediated endocytosis in almost all eukaryotic
cells. Because of the cell type-specific distribution of various receptors, the delivery can be
highly specific (Wu and Wu, 1993).

Receptor-mediated gene targeting vehicles generally consist of two components: a cell
receptor-specific ligand and a DNA-binding agent. Several ligands have been used for
receptor-mediated gene transfer. The most extensively characterized ligands are asialoorosomucoid
(ASOR) (Wu and Wu, 1987) and transferrin (Wagner et al., 1990). Recently, a synthetic
neoglycoprotein, which recognizes the same receptor as ASOR, has been used as a gene
delivery vehicle (Ferkol et al., 1993; Perales et al., 1994) and epidermal growth factor (EGF)
has also been used to deliver genes to squamous carcinoma cells (Myers, EPO 0273085).

In other embodiments, the delivery vehicle may comprise a ligand and a liposome. For 
example, Nicolau et al. (1987) employed lactosyl-ceramide, a galactose-terminal 
asialoganglioside, incorporated into liposomes and observed an increase in the uptake of the 
insulin gene by hepatocytes. Thus, it is feasible that a nucleic acid encoding a particular gene 
also may be specifically delivered into a cell type such as lung, epithelial or tumor cells, by any 
number of receptor-ligand systems with or without liposomes. For example, epidermal growth 
factor (EGF) may be used as the ligand for receptor-mediated delivery of a nucleic acid encoding 
a gene in many tumor cells that exhibit upregulation of EGF receptor. Mannose can be used to 
target the mannose receptor on liver cells. Also, antibodies to CD5 (CLL), CD22 (lymphoma), 
CD25 (T-cell leukemia) and MAA (melanoma) can similarly be used as targeting moieties. 

In certain embodiments, gene transfer may more easily be performed under ex vivo 
conditions. Ex vivo gene therapy refers to the isolation of cells from an animal, the delivery of a 
nucleic acid into the cells in vitro, and then the return of the modified cells back into an animal. 
This may involve the surgical removal of tissue/organs from an animal or the primary culture of 
cells and tissues. Anderson et al., U.S. Patent 5,399,346, incorporated herein in its 
entirety, disclose ex vivo therapeutic methods. 



10. PHARMACEUTICAL 

The gene delivery system of the instant invention can be administered in vivo in various 
ways including, but not limited to, intravenous, pharyngeal, epidermal, intramuscular, 
intraperitoneal (IP), nasal, and/or rectal. The gene delivery system of the instant invention can 
also be used for in vitro transfections of eukaryotic cell types which possess specific lipoprotein 
receptors on their cytoplasmic membranes, but is not limited to these cell types. 

Pharmaceutical products that may spring from the current invention may comprise 
naked polynucleotide containing single or multiple copies of the specific nucleotide sequences 
that bind to specific DNA-binding sites of the apolipoproteins present on plasma lipoproteins as 
described in the current invention. The polynucleotide may encode a biologically active 
peptide, antisense RNA, or ribozyme and will be provided in a physiologically acceptable 
administrable form. 

Another pharmaceutical product that may spring from the current invention may 
comprise a highly purified plasma lipoprotein fraction, isolated according to the methodology 
described herein from either the patient's blood or another source, and a polynucleotide containing 
single or multiple copies of the specific nucleotide sequences that bind to specific DNA-binding 
sites of the apolipoproteins present on plasma lipoproteins, prebound to the purified lipoprotein 
fraction in a physiologically acceptable, administrable form. 

Yet another pharmaceutical product may comprise a highly purified plasma lipoprotein 
fraction which contains recombinant apolipoprotein fragments containing single or multiple 
copies of specific DNA-binding motifs, prebound to a polynucleotide containing single or 
multiple copies of the specific nucleotide sequences, in a physiologically acceptable 
administrable form. 

The dosage to be administered depends to a great extent on the body weight and 
physical condition of the subject being treated as well as the route of administration and 
frequency of treatment. A pharmaceutical composition comprising the naked polynucleotide 
prebound to a highly purified lipoprotein fraction may be administered in amounts ranging from 
1 µg to 1 mg polynucleotide and 1 µg to 100 mg protein. 

Administration of the therapeutic virus particle to a patient will follow general protocols 
for the administration of chemotherapeutics, taking into account the toxicity, if any, of the 
vector. It is anticipated that the treatment cycles would be repeated as necessary. It also is 
contemplated that various standard therapies, as well as surgical intervention, may be applied in 
combination with the described gene therapy. 

Where clinical application of a gene therapy is contemplated, it will be necessary to 
prepare the complex as a pharmaceutical composition appropriate for the intended application. 
Generally this will entail preparing a pharmaceutical composition that is essentially free of 
pyrogens, as well as any other impurities that could be harmful to humans or animals. One also 
will generally desire to employ appropriate salts and buffers to render the complex stable and 
allow for complex uptake by target cells. 

Aqueous compositions of the present invention comprise an effective amount of the 
compound, dissolved or dispersed in a pharmaceutically acceptable carrier or aqueous medium. 
Such compositions can also be referred to as inocula. The phrases "pharmaceutically or 
pharmacologically acceptable" refer to molecular entities and compositions that do not produce 
an adverse, allergic or other untoward reaction when administered to an animal, or a human, as 
appropriate. As used herein, "pharmaceutically acceptable carrier" includes any and all 
solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption 
delaying agents and the like. The use of such media and agents for pharmaceutically active 
substances is well known in the art. Except insofar as any conventional media or agent is 
incompatible with the active ingredient, its use in the therapeutic compositions is contemplated. 
Supplementary active ingredients also can be incorporated into the compositions. 

The compositions of the present invention may include classic pharmaceutical 
preparations. Dispersions also can be prepared in glycerol, liquid polyethylene glycols, and 
mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations 
contain a preservative to prevent the growth of microorganisms. 



i) Disease States 

A wide variety of disease states may be treated with compositions according to the 
present invention. In essence, any disease that can be treated by provision of a protein or 
nucleic acid is amenable to this approach. Disease states include a variety of genetic 
abnormalities such as diabetes, cancer, cystic fibrosis and various other diseases that could be 
treated by increasing or decreasing expression of a protein in a target cell. 

Depending on the particular disease to be treated, administration of therapeutic 
compositions according to the present invention will be via any common route so long as the 
target tissue is available via that route. This includes oral, nasal, buccal, rectal, vaginal or 
topical. Topical administration would be particularly advantageous for treatment of skin 
cancers. Alternatively, administration will be by orthotopic, intradermal, subcutaneous, 
intramuscular, intraperitoneal or intravenous injection. Such compositions would normally be 
administered as pharmaceutically acceptable compositions that include physiologically 
acceptable carriers, buffers or other excipients. 

In certain embodiments, ex vivo therapies also are contemplated. Ex vivo therapies 
involve the removal, from a patient, of target cells. The cells are treated outside the patient's 
body and then returned. One example of ex vivo therapy would involve a variation of 
autologous bone marrow transplant (ABMT). Many times, ABMT fails because some cancer cells are 
present in the withdrawn bone marrow, and return of the bone marrow to the treated patient 
results in repopulation of the patient with cancer cells. In one embodiment, however, the 
withdrawn bone marrow cells could be treated while outside the patient with an LDL-DNA 
particle that targets and kills the cancer cell. Once the bone marrow cells are "purged," they can 
be reintroduced into the patient. 

The treatments may include various "unit doses." A unit dose is defined as containing a 
predetermined quantity of the therapeutic composition calculated to produce the desired 
responses in association with its administration, i.e., the appropriate route and treatment 
regimen. The quantity to be administered, and the particular route and formulation, are within 
the skill of those in the clinical arts. Also of import is the subject to be treated, in particular, the 
state of the subject and the protection desired. A unit dose need not be administered as a single 
injection but may comprise continuous infusion over a set period of time. A unit dose of the 
present invention may conveniently be described in terms of 0.01 mg DNA/kg body weight 
to 0.4 mg DNA/kg body weight, with ranges in between these being contemplated such that 
0.05, 0.10, 0.15, 0.20, 0.25, 0.5 mg DNA/kg body weight are administered. Likewise the 
amount of LDL delivered can vary from about 0.2 to about 8.0 mg/kg body weight. Thus in 
particular embodiments, 0.4 mg, 0.5 mg, 0.8 mg, 1.0 mg, 1.5 mg, 2.0 mg, 2.5 mg, 3.0 mg, 4.0 
mg, 5.0 mg, 5.5 mg, 6.0 mg, 6.5 mg, 7.0 mg and 7.5 mg of LDL may be delivered to an 
individual in vivo. The dosage of DNA-LDL to be administered depends to a great extent on 
the weight and physical condition of the subject being treated as well as the route of 
administration and the frequency of treatment. A pharmaceutical composition comprising the 
naked polynucleotide prebound to a highly purified lipoprotein fraction may be administered in 
amounts ranging from 1 µg to 1 mg polynucleotide and 1 µg to 100 mg protein. Thus, particular 
compositions may comprise 1 µg, 5 µg, 10 µg, 20 µg, 30 µg, 40 µg, 50 µg, 60 µg, 70 µg, 80 µg, 
100 µg, 150 µg, 200 µg, 250 µg, 500 µg, 600 µg, 700 µg, 800 µg, 900 µg or 1000 µg polynucleotide 
that is bound independently to 1 µg, 5 µg, 10 µg, 20 µg, 30 µg, 40 µg, 50 µg, 60 µg, 70 µg, 80 µg, 
100 µg, 150 µg, 200 µg, 250 µg, 500 µg, 600 µg, 700 µg, 800 µg, 900 µg, 1000 µg, 1.5 mg, 5 mg, 
10 mg, 20 mg, 30 mg, 40 mg, 50 mg, 60 mg, 70 mg, 80 mg, 90 mg or 100 mg lipoprotein. Any 
amount of polynucleotide may be bound to any other amount of lipoprotein to achieve the 
pharmaceutical concentrations of the present invention. 
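
By way of illustration only, the per-kilogram unit-dose ranges given above (read here as 0.01 to 0.4 mg DNA/kg and about 0.2 to about 8.0 mg LDL/kg) can be scaled to an individual subject as in the following minimal sketch. The function name and the 70 kg body weight are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch only: scale the per-kilogram ranges quoted in the text
# to a single subject. Names and the example body weight are hypothetical.
DNA_MG_PER_KG = (0.01, 0.4)  # unit-dose range for DNA, mg per kg body weight
LDL_MG_PER_KG = (0.2, 8.0)   # range for LDL, mg per kg body weight


def dose_range_mg(per_kg_range, body_weight_kg):
    """Return the (low, high) total amount in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = per_kg_range
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical subject
    dna_low, dna_high = dose_range_mg(DNA_MG_PER_KG, weight_kg)
    ldl_low, ldl_high = dose_range_mg(LDL_MG_PER_KG, weight_kg)
    print(f"DNA: {dna_low:.2f} to {dna_high:.2f} mg")
    print(f"LDL: {ldl_low:.2f} to {ldl_high:.2f} mg")
```

Both ranges keep the same 1:20 DNA-to-LDL proportion at their lower and upper ends.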

ii) Cancer 

One of the preferred embodiments of the present invention involves the use of the LDL 
vectors to deliver therapeutic genes to cancer cells. Target cancer cells include cancers of the 
lung, brain, prostate, kidney, liver, ovary, breast, skin, stomach, esophagus, head & neck, 
testicles, colon, cervix, lymphatic system and blood. Of particular interest are non-small cell 
lung carcinomas including squamous cell carcinomas, adenocarcinomas and large cell 
undifferentiated carcinomas. 

According to the present invention, one may treat the cancer by directly injecting a 
tumor with the LDL vector. Alternatively, the tumor may be infused or perfused with the 
vector using any suitable delivery vehicle. Local or regional administration, with respect to the 
tumor, also is contemplated. Finally, systemic administration may be performed. Continuous 
administration also may be applied where appropriate, for example, where a tumor is excised 
and the tumor bed is treated to eliminate residual, microscopic disease. Delivery via syringe or 
catheterization is preferred. Such continuous perfusion may take place for a period from about 
1-2 hours, to about 2-6 hours, to about 6-12 hours, to about 12-24 hours, to about 1-2 days, to 
about 1-2 weeks or longer following the initiation of treatment. Generally, the dose of the 
therapeutic composition via continuous perfusion will be equivalent to that given by a single or 
multiple injections, adjusted over a period of time during which the perfusion occurs. 

For tumors of > 4 cm, the volume to be administered will be about 4-10 ml (preferably 
10 ml), while for tumors of < 4 cm, a volume of about 1-3 ml will be used (preferably 3 ml). 
Multiple injections delivered as a single dose comprise about 0.1 to about 0.5 ml volumes. The 
LDL-DNA particles may advantageously be contacted by administering multiple injections to 
the tumor, spaced at approximately 1 cm intervals. 
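
The size-dependent volumes stated above may be summarized, purely as an illustrative sketch, by a simple lookup. The function name is hypothetical, and because the text gives no volume for a tumor of exactly 4 cm, the sketch declines to return one in that case.

```python
# Illustrative sketch only: look up the administration volume stated in the
# text for a given tumor size. The function name is hypothetical.
def injection_volume_ml(tumor_size_cm):
    """Return (minimum, maximum, preferred) volume in ml for a tumor size in cm."""
    if tumor_size_cm <= 0:
        raise ValueError("tumor size must be positive")
    if tumor_size_cm > 4:
        return (4.0, 10.0, 10.0)  # about 4-10 ml, preferably 10 ml
    if tumor_size_cm < 4:
        return (1.0, 3.0, 3.0)    # about 1-3 ml, preferably 3 ml
    # The text addresses only tumors larger or smaller than 4 cm.
    raise ValueError("no volume is stated for a tumor of exactly 4 cm")


if __name__ == "__main__":
    for size_cm in (2.0, 5.5):  # hypothetical tumor sizes
        print(size_cm, injection_volume_ml(size_cm))
```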

In certain embodiments, the tumor being treated may not, at least initially, be resectable. 
Treatments with therapeutic constructs may increase the resectability of the tumor due to 
shrinkage at the margins or by elimination of certain particularly invasive portions. Following 
treatments, resection may be possible. Additional treatments subsequent to resection will serve 
to eliminate microscopic residual disease at the tumor site. 

A typical course of treatment, for a primary tumor or a post-excision tumor bed, will 
involve multiple doses. Typical primary tumor treatment involves a 6 dose application over a 
two week period. The two week regimen may be repeated one, two, three, four, five, six or 
more times. During a course of treatment, the need to complete the planned dosings may be 
reevaluated. 

Cancer therapies also include a variety of combination therapies with both chemical and 
radiation based treatments. Combination chemotherapies include, for example, cisplatin 
(CDDP), carboplatin, procarbazine, mechlorethamine, cyclophosphamide, ifosfamide, 
melphalan, chlorambucil, busulfan, nitrosourea, dactinomycin, daunorubicin, doxorubicin, 
bleomycin, plicamycin, mitomycin, etoposide (VP16), tamoxifen, taxol, transplatinum, 
5-fluorouracil, vincristine, vinblastine and methotrexate. 

Combination radiation therapies may be X- and γ-irradiation. Dosage ranges for 
X-irradiation range from daily doses of 2000 to 6000 roentgens for prolonged periods of time (3 to 
4 weeks), to single doses of 2000 to 6000 roentgens. Dosages for radioisotopes vary widely, 
and depend on the half-life of the isotope, the strength and type of radiation emitted, and the 
uptake by neoplastic cells. 

Various combinations may be employed, where gene therapy is "A" and the radio- or 
chemotherapeutic agent is "B": 

A/B/A B/A/B B/B/A A/A/B A/B/B B/A/A A/B/B/B B/A/B/B 

B/B/B/A B/B/A/B A/A/B/B A/B/A/B A/B/B/A B/B/A/A 

B/A/B/A B/A/A/B A/A/A/B B/A/A/A A/B/A/A A/A/B/A 
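
The schedules listed above appear to be every three-course and four-course ordering in which both "A" and "B" occur at least once (6 orderings of length three and 14 of length four). The following is a minimal sketch that enumerates such orderings; the function name is hypothetical and the enumeration is offered only as a check on the list, not as part of the disclosure.

```python
# Illustrative sketch only: enumerate treatment orderings that use both the
# gene therapy ("A") and the radio- or chemotherapeutic agent ("B").
from itertools import product


def mixed_schedules(length):
    """Return all orderings of the given length containing both A and B."""
    return [
        "/".join(sequence)
        for sequence in product("AB", repeat=length)
        if len(set(sequence)) == 2  # drop all-A and all-B orderings
    ]


if __name__ == "__main__":
    for courses in (3, 4):
        schedules = mixed_schedules(courses)
        print(courses, len(schedules), " ".join(schedules))
```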

The terms "contacted" and "exposed," when applied to a cell, are used herein to describe 
the process by which a therapeutic construct and a chemotherapeutic or radiotherapeutic agent 
are delivered to a target cell or are placed in direct juxtaposition with the target cell. To achieve 
cell killing or stasis, both agents are delivered to a cell in a combined amount effective to kill 
the cell or prevent it from dividing. 

The therapeutic compositions of the present invention are advantageously administered 
in the form of injectable compositions either as liquid solutions or suspensions; solid forms 
suitable for solution in, or suspension in, liquid prior to injection may also be prepared. These 
preparations also may be emulsified. A typical composition for such purpose comprises a 
pharmaceutically acceptable carrier. For instance, the composition may contain 10 mg, 25 mg, 
50 mg or up to about 100 mg of human serum albumin per milliliter of phosphate buffered 
saline. 

Other pharmaceutically acceptable carriers include aqueous solutions, non-toxic 
excipients, including salts, preservatives, buffers and the like. Examples of non-aqueous 
solvents are propylene glycol, polyethylene glycol, vegetable oil and injectable organic esters 
such as ethyl oleate. Aqueous carriers include water, alcoholic/aqueous solutions, saline 
solutions, parenteral vehicles such as sodium chloride, Ringer's dextrose, etc. Intravenous 
vehicles include fluid and nutrient replenishers. Preservatives include antimicrobial agents, 
anti-oxidants, chelating agents and inert gases. The pH and exact concentration of the various 
components of the pharmaceutical composition are adjusted according to well known parameters. 

Additional formulations are suitable for oral administration. Oral formulations include 
such typical excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, 
magnesium stearate, sodium saccharine, cellulose, magnesium carbonate and the like. The 
compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release 
formulations or powders. When the route is topical, the form may be a cream, ointment, salve 
or spray. 

11. EXAMPLES 

The following examples are included to demonstrate preferred embodiments of the 
invention. It should be appreciated by those of skill in the art that the techniques disclosed in 
the examples which follow represent techniques discovered by the inventor to function well in 
the practice of the invention, and thus can be considered to constitute preferred modes for its 
practice. However, those of skill in the art should, in light of the present disclosure, appreciate 
that many changes can be made in the specific embodiments which are disclosed and still obtain 
a like or similar result without departing from the spirit and scope of the invention. 

EXAMPLE 1 
MATERIALS AND METHODS 
1. Isolation of Plasma Lipoproteins 

Restriction endonucleases were purchased from Life Technologies, and protease 
inhibitors (i.e., leupeptin, PMSF, and Trasylol) were purchased from Sigma Chemical 
Company. Plasma lipoproteins were isolated using standard sequential flotation 
ultracentrifugation methods as described (Schumaker and Puppione, 1986). Throughout the 
entire procedure samples were kept on ice or at 4°C unless otherwise stated. 

Subjects were fasted for at least 4 h prior to the start of the experimental procedures. 
Blood was drawn into sterile, evacuated glass tubes containing anticoagulants, e.g., 0.1% 
(ethylenedinitrilo)-tetraacetic acid (EDTA) or heparin. Plasma was obtained by centrifugation 
(10 minutes at 3000 x g) and immediately adjusted to 0.005% phenylmethanesulfonyl fluoride 
(PMSF), 10 KIU Trasylol/ml, and 1 µg leupeptin/ml. VLDL, LDL, and HDL fractions were 
isolated by sequential flotation ultracentrifugation for 18 h at 40,000 rpm in a Beckman 
centrifuge Model L8-80M after plasma samples were adjusted with potassium bromide (KBr) 
to solution densities of 1.006, 1.019, and 1.215 g/ml respectively. Immediately following 
ultracentrifugation, individual lipoprotein fractions were collected and dialyzed extensively 
against phosphate buffered saline (pH 7.4) containing 0.001% sodium azide. Protein 
concentrations were determined using standard BCA protein assays (Pierce Chemical 
Company). 

2. DNA-Binding Protocol 

Lipoproteins and DNA were mixed together and incubated for 30 min at room 
temperature in 50 mmoles/liter Tris (pH 7.4), 100-154 mmoles/liter sodium chloride (NaCl), and 
15 mmoles/liter magnesium chloride (MgCl2). 6X Sample loading buffer (30% glycerol, 0.25% 
Xylene cyanole FF, 0.25% bromophenol blue) was added to the samples in a 1:5 V/V ratio. 
Samples were underloaded into 30 µl wells at the cathode edge of an 0.8% agarose gel 
containing 1 µg ethidium bromide/ml in Tris-Acetate buffer (pH 7.85) and electrophoresis was 
accomplished at a constant 100 V until the negatively charged tracking dye had migrated at 
least 50% of the distance from the loading well to the anodic edge of the gel. 
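
As an illustration of the loading-buffer arithmetic, the sketch below assumes that the 1:5 V/V ratio means one volume of 6X buffer per five volumes of sample, which brings the buffer to 1X in the loaded mixture. The function names and the 25 µl sample volume are hypothetical.

```python
# Illustrative sketch only. Assumption: "1:5 V/V" is read as one part 6X
# loading buffer to five parts sample. Names and volumes are hypothetical.
def loading_buffer_volume_ul(sample_volume_ul, buffer_parts=1, sample_parts=5):
    """Volume of loading buffer (µl) to add to a sample of the given volume."""
    if sample_volume_ul <= 0:
        raise ValueError("sample volume must be positive")
    return sample_volume_ul * buffer_parts / sample_parts


def final_buffer_strength(stock_strength=6.0, buffer_parts=1, sample_parts=5):
    """Strength of the buffer (in X) after mixing with the sample."""
    return stock_strength * buffer_parts / (buffer_parts + sample_parts)


if __name__ == "__main__":
    sample_ul = 25.0  # hypothetical sample volume
    buffer_ul = loading_buffer_volume_ul(sample_ul)
    print(f"add {buffer_ul:.1f} µl of 6X buffer, total {sample_ul + buffer_ul:.1f} µl")
    print(f"final buffer strength: {final_buffer_strength():.1f}X")
```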

3. Agarose Electrophoretogram of Human Lipoproteins 

Agarose electrophoresis of human lipoproteins was performed to illustrate the 
differential migration patterns of lipoprotein fractions VLDL, LDL, and HDL isolated from 
human plasma resolved using non-denaturing conditions. 

Plasma lipoproteins were isolated from human blood according to the protocol described 
above. 6X Sample loading buffer (30% glycerol, 0.25% Xylene cyanole FF, 0.25% 
bromophenol blue) was added to the samples in a 1:5 V/V ratio. Samples were underloaded 
into 30 µl wells at the cathode edge of an 0.8% agarose gel in Tris-Acetate buffer (pH 7.85) 
and electrophoresis was accomplished at a constant 100 V until the negatively charged 
tracking dye had migrated at least 50% of the distance from the loading well to the anodic edge 
of the gel. 

Following electrophoresis, the agarose gel was stained for protein in a solution 
containing 50% V/V ethanol, 10% V/V acetic acid, and 0.25% Coomassie Brilliant Blue R-250 
(CBB R-250, Bio-Rad Labs). Lane 1 contained human VLDL (10 µg protein), Lane 2 
contained human LDL (35 µg protein), and Lane 3 contained human HDL (35 µg protein). 
Results illustrated the differential migration of lipoprotein fractions, VLDL, LDL, and HDL, 
isolated from human plasma resolved using non-denaturing conditions by agarose gel 
electrophoresis. Lipoproteins were visualized using a protein binding dye, Coomassie Brilliant 
Blue (CBB). The absence of other bands in each lane indicated the high degree of purity for 
each lipoprotein. 

4. Radioisotope Labeling of Deoxyoligonucleotides 

Complementary single stranded oligonucleotides were mixed (10 µg each) and 
incubated at 85°C for 5 min in 10 mM Tris HCl (pH 7.4). Immediately following incubation, 
the samples were cooled down slowly to room temperature to obtain double stranded 
oligonucleotides. The double stranded oligonucleotides were then digested with BamHI and 
EcoRI for 1 h at 37°C in 50 mM Tris HCl (pH 8.0), 100 mM NaCl, and 10 mM MgCl2. 
Digested double stranded oligonucleotides were purified using a Qiaquick nucleotide removal 
kit from Qiagen Inc. according to the manufacturer's protocol. The 5' protruding ends of the 
purified oligonucleotides were then labeled with 32P-dATP using a Prime-It II labeling kit 
containing Exo (-) Klenow enzyme from Stratagene Inc. according to the manufacturer's 
protocol. The specific activity of all oligonucleotides was determined by scintillation counting. 

The DNA-binding studies were performed as described above except that the agarose 
gel was not stained with ethidium bromide. Instead, following electrophoresis, the agarose gel 
was dried under vacuum and exposed to X-ray film for 4 h at room temperature prior to protein 
staining in a solution containing 50% V/V ethanol, 10% V/V acetic acid, and 0.25% Coomassie 
Brilliant Blue R-250 (Bio-Rad Labs). Oligonucleotides and human LDL were present at 
400,000 cpm and 40 µg protein per lane respectively. 

5. Sonication of plasma lipoproteins 

Solutions of plasma lipoproteins in phosphate-buffered saline containing 10 mM MgCl2 
were kept on ice and sonicated for various time periods ranging from 0 to 6 minutes in a 
Sonifier Model 350 sonicator (Branson Sonic Power Co.) at the following settings: duty cycle, 
30%, pulsed; output control, level 2. Immediately following sonication, genomic DNA was 
added to the sonicated solutions, and the DNA-binding assay (see above) was started. 

6. RT-PCR™ of Lipoprotein-bound RNA 

Human liver RNA, complexed to human LDL or to human VLDL as described above, 
was subjected to agarose gel electrophoresis and extracted from the gel by solubilizing the gel 
for 20 min at 50°C in 3 times the gel volume of QX-1 buffer (Qiagen) and by twice adding an 
equivalent volume of phenol/chloroform (pH 4.0). RNA was precipitated by adding an 
equivalent volume of 100% isopropanol and freezing the mixture overnight at -80°C. RNA 
pellets were dissolved in 50 µl of DEPC-treated water. For each reaction, the dissolved RNA (3 
µl) was reverse transcribed into single-stranded DNA by adding 100 mM KCl, 10 mM 
Tris-HCl (pH 8.3), 5 mM MgCl2, 2.5 µM primer (oligo d(T) or random hexamers), 1 U/µl RNase 
inhibitor, 1 mM each of dATP, dCTP, dTTP, and dGTP, and 2.5 U/µl of MuLV reverse 
transcriptase in a total reaction volume of 20 µl. The single-stranded DNA samples were then 
amplified in 100 mM KCl, 10 mM Tris-HCl (pH 8.3), 2 mM MgCl2, 0.15 µM each of the 
forward and reverse ISRE primers (see Table 2), 1 mM each of dATP, dCTP, dTTP, and dGTP, 
and 2.5 U/100 µl of AmpliTaq DNA polymerase in a total reaction volume of 100 µl. DNA 
amplification was carried out in a thermocycler in 30 consecutive cycles of denaturing at 95°C 
for 60 sec, reannealing at 55°C for 60 sec, and primer extension at 72°C for 120 sec, followed 
by a final extension at 72°C for 7 min. For each PCR reaction, 10 µl of the reaction mixture was 
analyzed by electrophoresis on a 1% agarose gel in TBE buffer (45 mM Tris-borate and 1 mM 
EDTA, pH 8.0) at a constant 100 V for 1 h. The PCR products were visualized by 
staining the gel with ethidium bromide. 
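
For orientation, the programmed hold time of the amplification described above can be tallied as in the following minimal sketch. The names are hypothetical, and the total excludes temperature ramping between steps, which depends on the instrument and is not stated in the text.

```python
# Illustrative sketch only: sum the programmed hold times of the PCR profile
# described in the text. Ramp times between temperatures are not included.
CYCLES = 30
STEP_SECONDS = {
    "denature_95C": 60,
    "anneal_55C": 60,
    "extend_72C": 120,
}
FINAL_EXTENSION_SECONDS = 7 * 60  # final extension at 72°C


def total_hold_seconds(cycles=CYCLES, steps=STEP_SECONDS,
                       final_extension=FINAL_EXTENSION_SECONDS):
    """Total programmed hold time in seconds for the cycling profile."""
    return cycles * sum(steps.values()) + final_extension


if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"{seconds} s, i.e. {seconds // 60} min of programmed hold time")
```

Under these figures the holds alone amount to 7620 seconds, or 127 minutes.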



7. DNA sequencing 

DNA fragments obtained from the RT-PCR reactions were separated by electrophoresis 
on a 1% agarose gel and extracted from the gel by using a Qiagen gel extraction kit according to 
the manufacturer's protocol. DNA samples were analyzed on an Applied Biosystems Inc. 
model 373 automated DNA sequencing apparatus after dye-terminator thermal cycle sequencing. 

8. Cell culture and transfection assays. 

Human skin fibroblasts were cultured in complete growth medium consisting of 
Dulbecco's modified Eagle's medium that was supplemented with 10% fetal bovine serum, 100 
µg/ml each of streptomycin and penicillin at 37°C in an atmosphere of 5% CO2 in a humidified 
incubator. Twenty-four hours before cell transfection, during exponential growth, the cultured 
cells were harvested by trypsinization, replated at a cell density of 1 x 10^ cells in 35-mm 
culture dishes containing a glass coverslip, and cultured in complete growth medium. All 
transfection experiments were performed in triplicate as described. 

9. LipoFectin assay. 

The pEGFP-N1 plasmid and LipoFectin were mixed together at a ratio of 1:4 (wt/wt) in 
200 µl of serum-free medium and incubated for 15 min at room temperature. When the cells 
reached 40 to 60% confluence, they were transfected with a mixture of 5 µg of DNA and 20 µg 
of LipoFectin per 35-mm culture dish, the mixture for each dish having been diluted in 1 ml of 
serum-free medium. Transfection was performed for 16 h at 37°C. Once transfection was 
achieved, the liposomes were removed from the culture dish by gentle washing and the cells 
were maintained in 2 ml of growth medium per 35-mm culture dish for 24 h at 37°C. Expression 
of GFP in the cells was determined by fluorescence microscopy. 



10. LDL assay. 

The pEGFP-N1 plasmid and LDL were mixed together at a ratio of 1:10 (wt/wt) in 100 
µl of serum-free medium containing 10 mM MgCl2 and incubated for 15 min at 37°C. When 
the cells were 40 to 60% confluent, they were transfected for 16 h at 37°C with a mixture of 5 
µg of DNA and 50 µg of LDL per 35-mm culture dish, the mixture for each dish having been 
diluted in 1 ml of serum-free medium. Once transfection was achieved, the LDLs were removed 
by gentle washing and the cells were maintained in 2 ml of growth medium per 35-mm culture 
dish for 24 h at 37°C. 
At 24 h after transfection, the cells were washed with PBS and fixed in 2 ml of PBS containing 
4% paraformaldehyde per 35-mm culture dish for 30 min. The coverslips were then removed 
from the culture dishes, washed with PBS, placed in an inverted orientation on glass slides, and 
examined by fluorescent microscopy to detect GFP. 
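
The weight ratios used in the LipoFectin and LDL assays above determine the carrier mass for a given DNA mass, as the following minimal sketch illustrates. The function name is hypothetical and the calculation is unit-agnostic, so the carrier mass is returned in the same unit as the DNA mass supplied.

```python
# Illustrative sketch only: carrier mass implied by a DNA:carrier weight ratio.
# The function name is hypothetical; units follow the DNA mass supplied.
def carrier_mass(dna_mass, dna_parts, carrier_parts):
    """Mass of carrier needed for a DNA:carrier ratio of dna_parts:carrier_parts."""
    if dna_mass <= 0 or dna_parts <= 0 or carrier_parts <= 0:
        raise ValueError("masses and ratio parts must be positive")
    return dna_mass * carrier_parts / dna_parts


if __name__ == "__main__":
    dna = 5.0  # DNA mass per 35-mm dish, as in both assays
    print("LipoFectin at 1:4 :", carrier_mass(dna, 1, 4))
    print("LDL at 1:10       :", carrier_mass(dna, 1, 10))
```

With 5 units of DNA, the 1:4 ratio gives 20 units of LipoFectin and the 1:10 ratio gives 50 units of LDL, matching the per-dish amounts stated in the two assays.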



11. In vivo reporter gene expression. 

Two-month-old female Sprague-Dawley rats were anesthetized with a combination 
anesthetic (42.8 mg/ml ketamine, 8.6 mg/ml xylazine, and 1.4 mg/ml acepromazine), and a 
prebound complex of purified rat LDL and linearized pEGFP-N1 plasmid DNA was injected 
intravenously (into the femoral vein), subcutaneously, intraperitoneally, and into the 
pharyngeal, nasal, and rectal mucosae (100 µg of LDL protein and 5 µg of DNA in 100 µl of 
PBS containing 10 mM MgCl2 per site). Control animals were injected with linearized 
pEGFP-N1 plasmid DNA in which the HCMV IE promoter sequence was interrupted only by 
digestion with restriction enzymes (5 µg of DNA in 100 µl of PBS containing 10 mM MgCl2 per 
site). After 2, 5, or 7 days, all the treated and control rats were sacrificed, their blood was 
collected by means of cardiac puncture, and the tissues were excised and immobilized in OCT by 
means of snap freezing over liquid nitrogen or by immediate freezing in liquid nitrogen. The 
immobilized tissue samples were sectioned on a cryomicrotome, and the sections (5-8 µm 
thick) were fixed for 30 min in 4% paraformaldehyde and analyzed for expression of EGFP 
(green fluorescent protein) by fluorescent microscopy. 

12. Fluorescent microscopy. 

Microscopy was performed by using an Olympus Model BH-2 fluorescent microscope 
(Olympus, USA) equipped with a digital camera (Hamamatsu, Model C5810) and a color 
printer (Image Master, Toshiba). The filter set used was a standard fluorescein isothiocyanate 
(FITC) set (Chroma Technology, Brattleboro, VT, USA). The maximum excitation and 
emission wavelengths for this filter set were 485 nm (range 460-510 nm) and 540 nm (range 
515-565 nm), respectively. Transfection efficiency was determined by calculating the average 
percentage of transduced cells of five different fields per 35-mm culture dish. 
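
The transfection efficiency calculation described above, an average of the percentage of transduced cells over five fields, can be sketched as follows. The function name and the cell counts are hypothetical and serve only to show the arithmetic.

```python
# Illustrative sketch only: mean percentage of transduced cells over the
# five microscope fields counted per dish. Counts shown are hypothetical.
def transfection_efficiency(fields):
    """fields: five (transduced_cells, total_cells) pairs; returns mean percent."""
    if len(fields) != 5:
        raise ValueError("exactly five fields are expected per dish")
    percentages = []
    for transduced, total in fields:
        if total <= 0 or not 0 <= transduced <= total:
            raise ValueError("invalid cell counts")
        percentages.append(100.0 * transduced / total)
    return sum(percentages) / len(percentages)


if __name__ == "__main__":
    counts = [(12, 100), (8, 80), (15, 120), (10, 100), (9, 90)]  # hypothetical
    print(f"{transfection_efficiency(counts):.1f}% of cells transduced")
```

Note that this averages the per-field percentages, as the text describes, rather than pooling all cells before dividing.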

13. Detection of GFP. 

Excised rat tissues were homogenized in 150 µl of PBS in a Dounce homogenizer placed 
on ice. The homogenized tissues were centrifuged for 3 min at 13,000 x g, and 50-µl aliquots 
were withdrawn and used in an ELISA assay to detect GFP. First, serial dilutions (range 1:10 
to 1:1,000) of all samples were made in PBS. ELISA plates (96 wells) were coated with the 
samples (three wells/sample) by incubating the plates at room temperature for 3 h. The plated 
samples were then washed three times with 200 µl of 1 x PBS containing 0.1% Tween 20 
(PBST) and blocked with 200 µl of PBST containing 1% bovine serum albumin (BSA) for 2 h 
at room temperature while shaking gently. The washing procedure was repeated with 200 µl of 
PBST containing 0.1% BSA, and the plated samples were incubated with a 1:2,000 dilution of a 
recombinant GFP polyclonal antibody (IgG fraction, Clontech Inc., Palo Alto, CA) in PBST 
containing 0.1% BSA (50 µl of diluted mixture per well) for 18 h at 4°C while shaking gently. 
The plated samples were washed and incubated with a 1:5000 dilution of HRP-conjugated goat 
anti-rabbit antibody (IgG fraction, Cappel, Durham, NC) in PBST containing 0.1% BSA for 1 h 
at room temperature while shaking gently. The washing procedure was repeated and was 
followed by a final wash with 1 x PBS. GFP was detected after a 30-min incubation at room 
temperature in PBS containing o-phenylenediamine as a chromogenic substrate. 
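
The text gives only the range of the serial dilutions (1:10 to 1:1,000) and not the step size. Assuming, for illustration, ten-fold steps, the series and the volumes mixed at each step can be sketched as below; the function names and the 100 µl working volume are hypothetical.

```python
# Illustrative sketch only. Assumption: ten-fold steps between 1:10 and
# 1:1,000; the text states only the range. Names and volumes are hypothetical.
def tenfold_series(first=10, last=1000):
    """Dilution factors from 1:first to 1:last in ten-fold steps."""
    factors = []
    factor = first
    while factor <= last:
        factors.append(factor)
        factor *= 10
    return factors


def step_volumes_ul(final_volume_ul, step_factor=10):
    """(transfer volume, PBS volume) in µl for one dilution step."""
    transfer = final_volume_ul / step_factor
    return (transfer, final_volume_ul - transfer)


if __name__ == "__main__":
    transfer_ul, pbs_ul = step_volumes_ul(100.0)
    for factor in tenfold_series():
        print(f"1:{factor}: {transfer_ul:.0f} µl of previous tube + {pbs_ul:.0f} µl PBS")
```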

EXAMPLE 2 

BINDING OF HUMAN GENOMIC DNA TO HUMAN LDL 

The binding of human genomic DNA (hg DNA) to human LDL has also been 
demonstrated. Each lane of the agarose gel contained hg DNA cut with AluI or HindIII. In 
addition, human VLDL and mouse LDL were run alongside the hg DNA. 

Plasma lipoproteins were isolated from human or mouse blood according to the protocol 
described above. DNA-binding studies were performed using human genomic DNA digested 
with either AluI or HindIII. Following electrophoresis, the gel was stained for DNA with 
ethidium bromide prior to protein staining in a solution containing 50% V/V ethanol, 10% V/V 
acetic acid, and 0.25% Coomassie Brilliant Blue R-250 (CBB R-250, Bio-Rad Labs). 

Each lane contained 5 µg human genomic DNA (hg DNA) cut with AluI or HindIII. In 
addition, human VLDL (10 µg protein per lane), human LDL (35 µg protein per lane) and 
mouse LDL (10 µg protein per lane) were also analysed. 

Gel-shift electrophoresis in this study showed specific binding of digested human DNA 
fragments to human LDL. DNA fragments obtained by AluI or HindIII digestion 
of human genomic DNA are shown to migrate toward the anode with much slower mobility 
when preincubated with human LDL but not when incubated with human VLDL, human HDL, 
or mouse LDL. The complexed DNA/lipoprotein bands were first visualized using DNA-binding 
ethidium bromide and photographed using transmitted ultraviolet light for activation of the 
fluorescent dye. Lipoproteins were next visualized with CBB and photographed using 
transmitted visible light. The results shown in this figure indicate that aliquots of AluI- and 
HindIII-digested human genomic DNA fragments comigrate with human LDL and are 
therefore bound to human LDL. 

While AluI and HindIII were used to digest genomic DNA in the studies shown here, 
the inventors of the instant invention have also used BamHI and PvuI for genomic DNA digestion. 
It is understood by those of skill in the art that there are many known restriction enzymes, all 
of which are capable of genomic DNA digestion resulting in DNA that can be successfully 
bound to LDL. DNA digested with AluI yields DNA of very small size (200-700 nucleotides), 
which allows isolation of the slower migrating digested DNA bound to LDL from the unbound 
digested DNA using agarose gel electrophoresis. Digestion of genomic DNA with HindIII 
yields genomic DNA of greater average size (1000-7000 nucleotides), which reaches the upper 
size limit for separation by agarose gel electrophoresis (the technique used here); however, there 
are other known DNA separation techniques which would work similarly to accomplish the 
goal of separating free DNA from DNA bound to LDL. The choice of which separation 
technique to use is dependent only on the size of the DNA fragments resulting after digestion. 
In principle, undigested genomic DNA would also work. 
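
The difference in fragment size between a four-base cutter and a six-base cutter can be illustrated with a short in silico digest. The sketch below is offered as an illustration only: the recognition sites are the standard ones for the named enzymes, while the function names, the toy sequence, and the assumption of a random sequence with equal base frequencies (used for the expected spacing of 4^n bases between sites) are not part of the disclosure.

```python
# Recognition site and top-strand cut offset for each enzyme named in the text.
ENZYMES = {
    "AluI": ("AGCT", 2),       # AG^CT
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "PvuI": ("CGATCG", 4),     # CGAT^CG
}

def expected_fragment_length(enzyme):
    """Expected spacing between sites in random DNA with equal base frequencies."""
    site, _ = ENZYMES[enzyme]
    return 4 ** len(site)

def digest(sequence, enzyme):
    """Return the fragment lengths produced by cutting a linear sequence."""
    site, offset = ENZYMES[enzyme]
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]

# Toy sequence containing one AluI-only site and one HindIII site (illustrative only).
toy = "TTAGCTGGAAGCTTCC"
alu_fragments = digest(toy, "AluI")
hind_fragments = digest(toy, "HindIII")
```

Under the stated assumption, a four-base site is expected roughly every 256 bases and a six-base site roughly every 4096 bases, which is of the same order as the size ranges reported above. Every HindIII site also contains an AluI site, so AluI cuts the toy sequence twice.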



EXAMPLE 3 
BINDING OF PLASMID DNA TO HUMAN LDL 

Plasma LDL was isolated from human blood according to the protocol previously 
described in Example 1. DNA-binding studies were performed using plasmid DNA (pBluescript 
II KS, Stratagene Inc.) digested with Pvu I. Following electrophoresis, the agarose gel was 
stained for DNA with ethidium bromide prior to protein staining in a solution containing 
50% V/V ethanol, 10% V/V acetic acid, and 0.25% Coomassie Brilliant Blue R-250 (CBB R-250, 
Bio-Rad Labs). The binding of plasmid DNA to human LDL was shown in a gel that contained 
0.5 µg molecular size DNA marker (Lane 1); 2 µg pKS DNA cut with Pvu I (Lanes 2-4); 35 µg 
human LDL (Lane 3) and 70 µg human LDL protein (Lane 4). 

Results of the electrophoretogram illustrated specific binding of Pvu I digested plasmid 
DNA (pBluescript II KS, Stratagene Inc.) and human LDL. Increased amounts of human LDL 
also caused an increase of DNA shifted to the LDL location and a decrease of the free Pvu I 
digested DNA band. Co-migration of the Pvu I digested DNA and human LDL is proof of a 
physical complex composed of LDL and DNA. 



EXAMPLE 4 
BINDING OF CMV PROMOTER-REGULATORY 
SEQUENCES TO HUMAN LDL 

Plasma lipoproteins were isolated from human or mouse blood according to the protocol 
previously described in Example 1. DNA-binding studies were performed using plasmid DNA 
(either pBluescript II KS or pBKCMV, Stratagene Inc.) digested with BamHI. Following 
electrophoresis, the agarose gel was stained for DNA with ethidium bromide prior to protein 
staining in a solution containing 50% V/V ethanol, 10% V/V acetic acid, and 0.25% Coomassie 
Brilliant Blue R-250 (CBB R-250, Bio-Rad Labs). Loading quantities per lane were as follows: 
plasmid DNA: 1 µg DNA/lane 
human VLDL: 35 µg protein/lane 
human LDL: 35 µg protein/lane 
mouse VLDL: 8 µg protein/lane 
mouse LDL: 35 µg protein/lane 

This study used BamHI-cut pKS, BamHI-cut pBKCMV, human VLDL, human LDL, mouse 
VLDL and mouse LDL. 

A comparison was made of human LDL complexed with the BamHI-linearized plasmids 
pBluescript II KS and pBKCMV. The inventors' results illustrated that specific binding of 
BamHI-linearized plasmid DNA and human LDL occurs, but these BamHI-linearized plasmids 
do not complex with human VLDL, mouse VLDL or mouse LDL under the conditions previously 
described in the DNA-binding protocol (Example 2). Further, enhanced binding of human LDL 
to the BamHI-linearized plasmid pBKCMV DNA, which contains the cytomegalovirus 
promoter region SEQ ID NO:225 (Table 2), was observed as compared to the BamHI-linearized 
plasmid pBluescript II KS DNA, which does not contain the cytomegalovirus promoter region 
(lane 3). Because binding of DNA by LDL is enhanced in the presence of the CMV promoter, 
it is possible that LDL binds specifically to the CMV promoter sequence (SEQ ID NO:225, see 
Table 2). 

Aliquots containing approximately 8 µg of mouse VLDL protein were used in each of the 
DNA-binding assay mixtures resolved in lanes 4 and 9, as compared to 35 µg of total protein 
for all other lipoproteins (lanes 2, 3, 5, 7, 8, and 10). Due to the low physiological 
concentration of VLDL in mouse plasma and the limited loading capacity of the gel, it was not 
possible to load 35 µg of mouse VLDL protein per lane. Therefore, this study does not allow for 
a quantitative comparison of the plasmid DNA-binding capacity of mouse VLDL vs. human 
VLDL, human LDL, and mouse LDL. 



TABLE 2 

Nucleotide Sequence of the Promoter Region (1300-1900) of the Human Cytomegalovirus 

SEQ ID NO:225 




GGATCTGACG 
ACACGCCTAC CGCCCATTTG CGTCAATGGG GCGGAGTTGT TACGACATTT 
TGGAAAGTCC CGTTGATTTT GGTGCCAAAA CAAACTCCAT TGACGTCAAT 
GGGGTGGAGA CTTGGAAATC CCCGTGAGTC AAACCGCTAT CCACGCCCAT 
TGATGTACTG CCAAAACCGC ATCACCATGG TAATAGCGAT GACTAATACG 
TAGATGTACT GCCAAGTAGG AAAGTCCCAT AAGGTCATGT ACTGGGCATA 
ATGCCAGGCG GGCCATTTAC CGTCATTGAC GTCAATAGGG GGCGTACTTG 
GCATATGATA CACTTGATGT ACTGCCAAGT GGGCAGTTTA CCGTAAATAC 
TCCACCCATT GACGTCAATG GAAAGTCCCT ATTGGCGTTA CTATGGGAAC 
ATACGTCATT ATTGACGTCA ATGGGCGGGG GTCGTTGGGC GGTCAGCCAG 
GCGGGCCATT TACCGTAAGT TATGTAACGC GGAACTCCAT ATATGGGCTA 
TGAACTAATG ACCCCGTAAT TGATTACTAT TAATAACTA 



Major repeat regions are indicated in bold and underlined. 



EXAMPLE 5 
BINDING OF SRE, E/C, FAS, AND ISRE 
DEOXYNUCLEOTIDE SEQUENCES TO HUMAN LDL 

Plasma lipoproteins were isolated from human or mouse blood according to the protocol 
previously described in Example 1. DNA-binding studies were performed using the synthetic 
oligonucleotides SRE, E/C, and FAS (see Table 3 for nucleotide sequences). 



TABLE 3 

Deoxyribonucleic Acid Sequences of Synthetic Oligonucleotides 
Used in Binding Studies with LDL 

SEQ ID NO   Oligo Name   Sequence (5'-3') 
226         SRE-2A       GATCCAAATCACCCACTGCAACTCCTCCCCCTGCG 
227         E/C-1A       GATCCATCCAATTGGGCAATCAGGAG 
228         FAS-1A       GATCCGGTCTCCAATTGG 
229         ISRE-1A      GATCCTCGGGAAAGGGAAACCGAAACTGAAGCCG 



DNA-binding studies were performed according to the previously described DNA-binding 
protocol (Example 2). Following electrophoresis, the agarose gel was stained for DNA 
with ethidium bromide prior to protein staining in a solution containing 50% V/V ethanol, 10% 
V/V acetic acid, and 0.25% Coomassie Brilliant Blue R-250 (CBB R-250, Bio-Rad Labs). 
Oligonucleotides were present at 1 µg DNA per lane. Lanes containing human LDL contained 
35 µg protein per lane and lanes containing mouse LDL contained 15 µg protein per lane. 

The data generated showed complexes of the synthetic, double-stranded oligonucleotide 
fragments with human LDL. The results strongly support that human LDL binds to these DNA 
sequences in a highly specific fashion. The synthetic oligonucleotides SRE-2A, E/C-1A, 
FAS-1A, and ISRE-1A (Table 3; SEQ ID NO:226, SEQ ID NO:227, SEQ ID NO:228, and SEQ ID 
NO:229, respectively) bind to human LDL but do not bind to mouse LDL. DNA binding to 
human LDL is illustrated by the appearance of a fraction of slower mobility DNA that 
comigrates with human LDL. 

In another embodiment of this same study, binding was determined using radioisotope 
labeling of the deoxynucleotide sequences as described in Example 1. The results from these 
DNA-binding studies show that human LDL binds to the synthetic oligonucleotides SRE-2A, 
E/C-1A, FAS-1A, and ISRE-1A (Table 3; SEQ ID NO:226, SEQ ID NO:227, SEQ ID NO:228, 
and SEQ ID NO:229, respectively) in a highly specific fashion. DNA binding to human LDL is 
illustrated by the appearance of a fraction of slower mobility DNA that comigrates with human 
LDL. The binding affinity of the different synthetic oligonucleotides for human LDL can be 
determined by kinetic binding studies using quantitative autoradiography well known to those 
of skill in the art. 



EXAMPLE 6 

BINDING OF VARIOUS NUCLEOTIDE SEQUENCES TO 
THE LDL ISOLATED FROM VARIOUS SPECIES 

Plasma lipoproteins were isolated from human, mouse, rat, or baboon blood according 
to the protocol previously described in Example 1. DNA-binding studies were performed 
according to the previously described DNA-binding protocol using the synthetic 
oligonucleotides SRE, E/C, and FAS (see Table 3 for nucleotide sequences), genomic DNA, or 
plasmid DNA containing the CMV promoter. A summary of the binding studies of the instant 
invention is presented in Tables 4A and 4B, below. Table 4A summarizes the binding of 
human, mouse, rat and baboon LDL to various forms and sources of DNA, and Table 4B 
lists the DNA/LDL complexes made thus far. 
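
For readers who wish to query the binding outcomes of Table 4A programmatically, the table can be held as a small lookup structure. The following is a sketch only; the variable and function names are hypothetical, and the baboon entry for hgDNA is taken as YES, which is an assumption of this sketch.

```python
# Table 4A as a lookup: DNA source -> one result per LDL species.
# True = binding (YES), False = no binding (NO), None = not determined (N.D.).
SPECIES = ("human", "mouse", "rat", "baboon")

TABLE_4A = {
    # The baboon entry for hgDNA is taken as YES (assumption of this sketch).
    "hgDNA": (True, False, True, True),
    "mgDNA": (None, None, True, None),
    "rgDNA": (None, None, True, None),
    "bgDNA": (None, None, None, True),
    "CMV": (True, False, True, True),
    "SRE": (True, False, None, False),
    "E/C": (True, False, None, False),
    "FAS": (True, False, None, False),
}

def binders(dna):
    """Return the LDL species reported to bind the given DNA."""
    return [s for s, result in zip(SPECIES, TABLE_4A[dna]) if result is True]

def undetermined():
    """Return (DNA, species) pairs for which binding was not determined."""
    return [
        (dna, s)
        for dna, row in TABLE_4A.items()
        for s, result in zip(SPECIES, row)
        if result is None
    ]
```

Keeping "not determined" distinct from "no binding" avoids reading an untested combination as a negative result.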



TABLE 4A 

Binding of Human, Mouse, Rat and Baboon LDL to Various Forms of DNA 

DNA      human LDL   mouse LDL   rat LDL   baboon LDL 
hgDNA    YES         NO          YES       YES 
mgDNA    N.D.        N.D.        YES       N.D. 
rgDNA    N.D.        N.D.        YES       N.D. 
bgDNA    N.D.        N.D.        N.D.      YES 
CMV      YES         NO          YES       YES 
SRE      YES         NO          N.D.      NO 
E/C      YES         NO          N.D.      NO 
FAS      YES         NO          N.D.      NO 



hg = human genomic DNA digested with either AluI or HindIII, mg = mouse genomic 
DNA digested with either AluI or HindIII, rg = rat genomic DNA digested with either AluI or 
HindIII, and bg = baboon genomic DNA digested with either AluI or HindIII. 

YES = binding, NO = no binding, N.D. = binding not determined 



TABLE 4B 

Specific LDL/DNA Complexes That Have Been Made 

DNA                              DNA Digested With   LDL 
human genomic                    AluI                human 
human genomic                    HindIII             human 
human genomic                    BamHI               human 
human genomic                    PvuI                human 
human genomic                    AluI                rat 
human genomic                    HindIII             rat 
human genomic                    BamHI               rat 
human genomic                    PvuI                rat 
human genomic                    AluI                baboon 
human genomic                    HindIII             baboon 
human genomic                    BamHI               baboon 
human genomic                    PvuI                baboon 
mouse genomic                    AluI                rat 
mouse genomic                    HindIII             rat 
rat genomic                      AluI                rat 
rat genomic                      HindIII             rat 
baboon genomic                   AluI                baboon 
baboon genomic                   HindIII             baboon 
pBSKS                            PvuI                human 
pBSKS                            BamHI               human 
pBKCMV                           BamHI               human 
pBKCMV                           BamHI               rat 
pBKCMV                           BamHI               baboon 
SRE-2A oligo (SEQ ID NO:226)     none                human 
E/C-1A oligo (SEQ ID NO:227)     none                human 
FAS-1A oligo (SEQ ID NO:228)     none                human 
ISRE-1A oligo (SEQ ID NO:229)    none                human 

EXAMPLE 7 

DETECTION OF LDL-BOUND DNA IN HUMAN BLOOD 

Plasma lipoproteins are isolated from human blood according to the protocol previously 
described in Example 1. 6X sample loading buffer (30% glycerol, 0.25% xylene cyanole FF, 
0.25% bromophenol blue) is added to the samples in a 1:5 V/V ratio. Samples are underloaded 
into 30 µl wells at the cathode edge of a 0.8% agarose gel in Tris-acetate buffer (pH 7.85), and 
electrophoresis is carried out at a constant 100 V until the negatively charged tracking 
dye migrates at least 50% of the distance from the loading well to the anodic edge of the gel. 
Following electrophoresis, the gel is stained for DNA with ethidium bromide prior to protein 
staining in a solution containing 50% V/V ethanol, 10% V/V acetic acid, and 0.25% Coomassie 
Brilliant Blue R-250 (CBB R-250, Bio-Rad Labs). If no DNA is detected by ethidium bromide 
staining, the agarose gel is subjected to Southern blot analysis using a labeled DNA probe. The 
DNA is labeled with a radioactive isotope (e.g., ³²P), a non-radioactive tag (DIG) or with any 
other standard DNA-labeling method known to one of skill in the art. Randomly synthesized, 
short oligonucleotides are used as the probe to detect, in a general fashion, whether or not DNA 
is bound to the isolated LDL. Controls include lanes containing known quantities of DNA, lanes 
containing purified LDL digested with DNase I, and LDL bound to DNA made by mixing 
purified LDL and DNA according to the method described in Example 2. 
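
The loading volumes implied by the protocol above can be worked out with a few lines of arithmetic. This sketch assumes that the 1:5 V/V ratio means one part of 6X buffer to five parts of sample, and that a 30 µl well is filled no further than its nominal capacity; the function names are hypothetical.

```python
def buffer_volume(sample_ul, buffer_to_sample=1 / 5):
    """Volume of loading buffer to add to a sample, in microliters."""
    return sample_ul * buffer_to_sample

def final_buffer_strength(stock_x=6.0, buffer_to_sample=1 / 5):
    """Buffer strength (in X) after mixing stock buffer with sample."""
    return stock_x * buffer_to_sample / (1 + buffer_to_sample)

def max_sample_volume(well_capacity_ul=30.0, buffer_to_sample=1 / 5):
    """Largest sample volume whose sample-plus-buffer mix still fits the well."""
    return well_capacity_ul / (1 + buffer_to_sample)

strength = final_buffer_strength()        # 6X stock diluted 1 part in 6 total
largest_sample = max_sample_volume()      # sample volume that fills a 30 µl well
buffer_needed = buffer_volume(largest_sample)
```

Under these assumptions the 6X stock is brought to 1X, and a full 30 µl well holds 25 µl of sample plus 5 µl of buffer; underloading, as the protocol specifies, means using somewhat less than this.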

LDL isolated from humans with cancer and subjected to the above protocol will have 
detectable DNA bound to the LDL in quantities greater than the amount of DNA bound to LDL 
isolated from humans without cancer. 



EXAMPLE 8 

DETECTION OF SPECIFIC TYPES OF CANCERS WITH 
SEQUENCE SPECIFIC DNA PROBES 

Not only is it possible to identify the presence or absence of cancer in a living body 
using the invention technique (as described in Example 14 above), it is also possible to identify 
specific cancer types by using sequence specific DNA probes. For example, LDL-bound DNA 
isolated from a patient with colon cancer will have a different DNA sequence than the 
LDL-bound DNA isolated from a patient with a different cancer type, for example, breast cancer. 
The different DNA sequences bound to the LDL isolated from different cancer patients are 
determined by first isolating LDL from the blood of a person with an independently identified 
and known cancer type, using the protocol in Example 1. This isolated LDL is then digested 
with various non-specific proteases to remove the LDL while retaining the DNA. This DNA is 
then sequenced using standard sequencing techniques. A list of the DNA sequences, along with 
the type of cancer each is associated with, is made. This list is then used to synthesize probes that 
can differentiate among the various types of cancer. These probes are used in screening of a 
patient with an unknown cancer type, or in the early detection of metastatic cancer, or as a 
general early screening technique for the presence or absence of specific cancer types. 



EXAMPLE 9 

METHODS FOR THE DETERMINATION OF METASTATIC GENE TRANSFER VIA 
LIPOPROTEINS AS NATIVE VECTORS 

In order to determine the sequence of polynucleotides bound to endogenous LDL, 
plasma LDL and other apoB-containing lipoproteins are captured using a monoclonal antibody 
to a specific apoB epitope such as 2G8 which is immobilized on an inert, hydrophilic and 
highly porous polymer microbead. The LDL-DNA complex is then isolated by elution using 
affinity chromatography technology. DNA is further purified from the isolated LDL/DNA 
complex using standard DNA purification methodology such as phenol/chloroform extraction 
followed by ethanol precipitation. Alternatively, purified DNA is isolated from the affinity 
column using elution conditions that disrupt protein/DNA complexes but not protein/protein 
complexes (i.e., antibody/LDL complex). The polynucleotide sequences are determined using 
the SRE, E/C, FAS, and ISRE-1A oligonucleotides (SEQ ID NO:226, SEQ ID NO:227, SEQ 
ID NO:228, and SEQ ID NO:229, respectively) in a standard PCR™ methodology in order to 
amplify polynucleotides with unknown sequences. The amplified PCR™ products (i.e., 
polynucleotides) are then isolated by agarose gel electrophoresis and subsequent DNA 
sequencing techniques well known to the art. 

Alternatively, identification of polynucleotide sequences that are bound to endogenous 
human LDL is via the specific binding of LDL to a plastic matrix such as a 96-well ELISA 
(enzyme-linked immunosorbent assay) plate coated with specific antibodies that bind to human 
LDL. In this embodiment, freshly isolated plasma containing endogenous lipoproteins is used 
to bind to the anti-human LDL antibodies using standard ELISA procedures known to the 
art. The presence and specific sequence of polynucleotides prebound to the endogenous LDL in 
each sample is determined by PCR™ technology. 

Because many varying and different embodiments may be made within the scope of the 
inventive concept herein taught, and because many modifications may be made in the 
embodiments herein detailed in accordance with the descriptive requirement of the law, it is to 
be understood that the details herein are to be interpreted as illustrative and not in a limiting 
sense. 



EXAMPLE 10 
LOW-DENSITY LIPOPROTEIN INTERACTS WITH 
HUMAN CYTOMEGALOVIRUS GENOMIC DNA 

DNA binding experiments with purified plasma lipoprotein fractions and human 
genomic DNA as well as several different plasmids indicate that purified LDL binds to human 
genomic DNA digested with different restriction enzymes (Alu I and Hind III). 

Purified LDL also bound to several different plasmids, but its binding affinity for 
plasmid DNA containing the HCMV IE promoter region was significantly higher. Both LDL 
and VLDL were shown to bind to the HCMV IE promoter region and to the SRE, 
MSRE, ISRE, MISRE, E/C, FAS, and MFAS oligonucleotides. The E/C oligonucleotide was 
used in these DNA binding studies because this oligonucleotide contains both a binding site for 
members of the C/EBP transcription factor family, which are involved in the regulation of 
differentiation-dependent adipocyte gene expression, as well as an overlapping E-box motif 
which is generally recognized by the eukaryotic basic helix-loop-helix (b-HLH) transcriptional 
regulators. LDL clearly has a greater affinity for all of the oligonucleotides tested than does 
VLDL. This is most likely due to interference with protein-DNA interaction caused by either 
the presence of other apolipoproteins on the surface of VLDL or an increased net charge as a 
result of the increased lipid content of VLDL. 

The sequence specificity is illustrated by the fact that both LDL and VLDL show a 
decreased binding affinity for the mutated versions of the ISRE and FAS oligos (MISRE and 
MFAS, respectively). In contrast, LDL showed an increased binding affinity for the mutated 
version of the SRE oligo (MSRE). It is possible that this mutated SRE sequence may be a 
better ligand for the putative DNA binding region of apo B present on LDL. The binding of 
both VLDL and LDL to the E/C oligonucleotide is not surprising since this oligo contains the 
E-box motif, which is a known binding site for b-HLH proteins, and similar b-HLH regions have 
been identified in apo B present on VLDL and LDL. 

The basis for the affinity for the HCMV IE promoter is not immediately obvious, since careful 
analysis does not reveal an exact copy of either an SRE, ISRE, FAS, or E/C sequence. However, 
the HCMV IE promoter region contains regulatory elements that are generally recognized by a 
large number of eukaryotic DNA-binding proteins, including a variety of different families of 
transcription factors, and it may therefore be possible that the identified b-HLH regions of apoB 
possess similar DNA binding properties. 
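
One simple way to check a target sequence for an exact copy of an oligonucleotide is to search both strands, as sketched below. This is an illustration only and is not the analysis performed by the inventors; the oligonucleotide sequences are those listed in Table 3, while the function names and the short toy target are hypothetical.

```python
# Oligonucleotide sequences as listed in Table 3 (5'-3').
OLIGOS = {
    "SRE-2A": "GATCCAAATCACCCACTGCAACTCCTCCCCCTGCG",
    "E/C-1A": "GATCCATCCAATTGGGCAATCAGGAG",
    "FAS-1A": "GATCCGGTCTCCAATTGG",
    "ISRE-1A": "GATCCTCGGGAAAGGGAAACCGAAACTGAAGCCG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def find_exact(target, motif):
    """Return sorted (position, strand) hits for exact copies of motif in target."""
    target = target.upper()
    motif = motif.upper()
    hits = set()
    for strand, query in (("+", motif), ("-", reverse_complement(motif))):
        pos = target.find(query)
        while pos != -1:
            hits.add((pos, strand))
            pos = target.find(query, pos + 1)
    return sorted(hits)

# Toy target that embeds FAS-1A between two flanking dinucleotides (illustrative only).
toy_target = "TT" + OLIGOS["FAS-1A"] + "AA"
forward_hits = find_exact(toy_target, OLIGOS["FAS-1A"])
reverse_hits = find_exact(reverse_complement(toy_target), OLIGOS["FAS-1A"])
```

An exact-match search of this kind reports nothing for near matches; detecting partial or degenerate copies of a motif would need a mismatch-tolerant comparison, which is outside the scope of this sketch.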

Another possibility is that other yet unidentified regions of apoB are involved in the 
binding to the HCMV IE promoter region. The fact that HDL, in contrast to VLDL and LDL, does 
not bind to any of the oligos tested suggests that the DNA binding results from the specific 
interaction with apo B. These data support the hypothesis that apo B contains DNA binding 
domains which show homology with the DNA binding domains of SREBP-1, SREBP-2, ADD-1, 
and ISGF3γ and that apo B containing lipoproteins therefore bind to specific nucleotide 
sequences similar to those bound by these known DNA binding proteins. 

Recent reports suggest a possible causal relationship between human cytomegalovirus 
(HCMV) and the development of atherosclerosis in humans. These reports together with data 
presented herein, which show that human LDL binds strongly to HCMV IE promoter 
sequences, led the inventors to investigate whether plasma LDL may play a role in the 
pathogenesis of HCMV induced atherosclerosis. 

To test this hypothesis, the inventors used the polymerase chain reaction (PCR) to look for 
HCMV DNA sequences in the purified plasma LDL fraction of human subjects who tested 
seropositive for HCMV. The results of these studies show that a PCR product of the expected size 
(170 bp) could be detected with both primer sets (MTR2 and IE) in the purified plasma LDL 
fraction of HCMV seropositive subjects. However, this 170 bp DNA fragment could not be 
detected in the plasma samples of these subjects (lanes 6-8). These data suggest that the use of 
purified plasma LDL fractions for detection of CMV nucleic acid sequences by PCR techniques 
is more sensitive than the use of whole plasma samples. Furthermore, the increased yield 
of PCR products from the purified plasma LDL fractions strongly suggests that HCMV DNA is 
predominantly associated with LDL within the plasma pool of HCMV seropositive subjects. 

EXAMPLE 11 
LOW-DENSITY LIPOPROTEIN AS A 
NATURAL GENE TRANSFER VECTOR 
The discovery of the nucleic acid-binding properties of apo B-100 suggested that 
lipoproteins containing apo B-100, as naturally occurring liposomes, may function as gene 
transfer agents. By using highly purified low-density lipoprotein as such an agent, the inventors 
were able to transfect cultured human skin fibroblasts in vitro and to express a green fluorescent 
protein reporter gene in vivo. The gene transfer mediated by low-density lipoprotein was more 
efficient than that mediated by LipoFectin. Low-density lipoprotein also did not exhibit any 
toxicity, immunogenicity, or serum inhibition. 

1. DNA-binding 

In the Examples above, it was shown that highly purified human LDL binds to nucleic 
acids in a specific fashion. In order to establish whether rat lipoproteins can bind nucleic acids 
in a similar fashion, DNA-binding experiments with different rat lipoprotein fractions were 
performed. A gel shift assay of linearized pBluescript KS and pBKCMV plasmid DNA and 
purified rat VLDL, LDL, and HDL fractions was performed. The data clearly demonstrate that 
the binding of nucleic acids is specific to the purified LDL fraction. 

The binding of LDL to DNA is exhibited by the retarded electrophoretic migration of 
DNA in agarose gel that is caused by the formation of complexes of higher molecular weight. 
In contrast, purified fractions of VLDL and HDL did not bind any of the DNA samples tested. 
The fact that purified HDL did not bind DNA was expected, since endogenous HDL does not 
contain apo B-100. Surprisingly, there was no apparent binding of DNA to apo B-100-containing 
VLDL. It is possible that the DNA-binding assay, which employs ethidium bromide 
staining to detect DNA, lacks sensitivity or that VLDL does not bind to DNA under the 
conditions of the DNA-binding assay. Another explanation could be a difference in the 
conformation of apo B-100 present on LDL as opposed to VLDL because of a difference in the 
lipid composition and protein content of the two lipoprotein fractions. 



2. In vitro cell transfection studies. 

Based on the findings of the DNA-binding assay, transfection studies were performed 
using a prebound complex of LDL and plasmid DNA that contained a reporter gene that 
encodes GFP. 

The data generated illustrated the successful transfection of human skin fibroblasts 
with LDL and pEGFP-N1 plasmid DNA. The transfection process was monitored by 
expression of the GFP-encoding gene, which is driven by the HCMV IE promoter. In addition to 
fluorescent microscopic analysis, expression of GFP was confirmed by a qualitative ELISA 
using a primary antibody against recombinant GFP and an HRP-conjugated secondary antibody 
with o-phenylenediamine as a chromogenic substrate. 

Human skin fibroblasts transfected with LDL exhibited a significantly lower intensity of 
green fluorescence than did cells transfected with LipoFectin, indicating that the level of GFP 
expression was lower in these LDL-transfected cells. When the percentages of positively 
transfected cells were compared, however, transfection with LDL yielded a higher percentage of 
transfected cells than did transfection with LipoFectin (60 to 70% versus 20 to 30%). 
In addition, LipoFectin-mediated transfection resulted in green fluorescence in the cell 
cytoplasm and in the nuclei, whereas LDL-mediated transfection resulted in green fluorescence 
predominantly in the cytoplasm. 

Transfection assays in which LDL concentrations were as high as 250 µg/ml of LDL 
protein produced no detectable effects on the confluence and viability of the cell cultures, 
whereas LipoFectin concentrations of 20 µg/ml resulted in significant loss of cell viability. 
Control cells that were transfected with linearized pEGFP-N1 plasmid DNA alone exhibited no 
fluorescence. 

3. In vivo reporter gene expression. 

To evaluate whether LDL could be used as a vehicle for in vivo gene delivery, a 
prebound rat LDL-pEGFP-N1 complex was administered to 1-month-old female Sprague-Dawley 
rats. Cryosections of the liver and heart tissues of the treated animals that had been 
excised 2 days after administration of the LDL-pEGFP-N1 complex showed significant levels of 
green fluorescence indicative of EGFP expression as determined by fluorescent microscopy. 

The expression of GFP in the different tissues was confirmed by a qualitative ELISA 
using a primary antibody against recombinant GFP and an HRP-conjugated secondary antibody 
with o-phenylenediamine as a chromogenic substrate. In contrast, only low levels of 
autofluorescence were observed in the cryosectioned tissues obtained from the control animals 
treated solely with linearized pEGFP-N1 DNA. These data demonstrate that purified LDL can 
be used in a prebound complex with DNA as an in vivo gene delivery system. 



All of the compositions and/or methods disclosed and claimed herein can be made and 
executed without undue experimentation in light of the present disclosure. While the 
compositions and methods of this invention have been described in terms of preferred 
embodiments, it will be apparent to those of skill in the art that variations may be applied to the 
compositions and/or methods and in the steps or in the sequence of steps of the method 
described herein without departing from the concept, spirit and scope of the invention. More 
specifically, it will be apparent that certain agents which are both chemically and 
physiologically related may be substituted for the agents described herein while the same or 
similar results would be achieved. All such similar substitutes and modifications apparent to 
those skilled in the art are deemed to be within the spirit, scope and concept of the invention as 
defined by the appended claims. 

(G) TELEPHONE: (512)418-3000 
(D) STATE: TX 
(ii) TITLE OF INVENTION: LIPOPROTEINS AS NUCLEIC ACID VECTORS 
(iii) NUMBER OF SEQUENCES: 229 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 
(A) APPLICATION NUMBER: US 09/079,030 
(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/874,807 

(B) FILING DATE: 13-JUN-1997 



(2) INFORMATION FOR SEQ ID NO: 1: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4536 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Glu Glu Glu Met Leu Glu Asn Val Ser Leu Val Cys Pro Lys Asp Ala 
1 5 10 15 

Thr Arg Phe Lys His Leu Arg Lys Tyr Thr Tyr Asn Tyr Glu Ala Glu 
20 25 30 

Ser Ser Ser Gly Val Pro Gly Thr Ala Asp Ser Arg Ser Ala Thr Arg 
35 40 45 

Ile Asn Cys Lys Val Glu Leu Glu Val Pro Gln Leu Cys Ser Phe Ile 
50 55 60 

Leu Lys Thr Ser Gln Cys Thr Leu Lys Glu Val Tyr Gly Phe Asn Pro 
65 70 75 80 

Glu Gly Lys Ala Leu Leu Lys Lys Thr Lys Asn Ser Glu Glu Phe Ala 
85 90 95 

Ala Ala Met Ser Arg Tyr Glu Leu Lys Leu Ala Ile Pro Glu Gly Lys 
100 105 110 

Gln Val Phe Leu Tyr Pro Glu Lys Asp Glu Pro Thr Tyr Ile Leu Asn 
115 120 125 

Ile Lys Arg Gly Ile Ile Ser Ala Leu Leu Val Pro Pro Glu Thr Glu 
130 135 140 

Glu Ala Lys Gln Val Leu Phe Leu Asp Thr Val Tyr Gly Asn Cys Ser 
145 150 155 160 

Thr His Phe Thr Val Lys Thr Arg Lys Gly Asn Val Ala Thr Glu Ile 
165 170 175 

Ser Thr Glu Arg Asp Leu Gly Gln Cys Asp Arg Phe Lys Pro Ile Arg 
180 185 190 

Thr Gly Ile Ser Pro Leu Ala Leu Ile Lys Gly Met Thr Arg Pro Leu 
195 200 205 

Ser Thr Leu Ile Ser Ser Ser Gln Ser Cys Gln Tyr Thr Leu Asp Ala 
210 215 220 

Lys Arg Lys His Val Ala Glu Ala lie Cys Lys Glu Gin His Leu Phe 
225 230 235 240 

Leu Pro Phe Ser Tyr Asn Asn Lys Tyr Gly Met Val Ala Gin Val Thr 

245 250 255 

Gin Thr Leu Lys Leu Glu Asp Thr Pro Lys lie Asn Ser Arg Phe Phe 

260 265 270 

Gly Glu Gly Thr Lys Lys Met Gly Leu Ala Phe Glu Ser Thr Lys Ser 
275 280 285 

Thr Ser Pro Pro Lys Gin Ala Glu Ala Val Leu Lys Thr Leu Gin Glu 
290 295 300 

Leu Lys Lys Leu Thr lie Ser Glu Gin Asn lie Gin Arg Ala Asn Leu 
305 310 315 320 

Phe Asn Lys Leu Val Thr Glu Leu Arg Gly Leu Ser Asp Glu Ala Val 

325 330 335 

Thr Ser Leu Leu Pro Gin Leu lie Glu Val Ser Ser Pro lie Thr Leu 

340 345 350 

Gin Ala Leu Val Gin Cys Gly Gin Pro Gin Cys Ser Thr His He Leu 
355 360 365 

Gin Trp Leu Lys Arg Val His Ala Asn Pro Leu Leu He Asp Val Val 
370 375 380 

Thr Tyr Leu Val Ala Leu He Pro Glu Pro Ser Ala Gin Gin Leu Arg 
385 390 395 400 

Glu He Phe Asn Met Ala Arg Asp Gin Arg Ser Arg Ala Thr Leu Tyr 

405 410 415 

Ala Leu Ser His Ala Val Asn Asn Tyr His Lys Thr Asn Pro Thr Gly 

420 425 430 

Thr Gin Glu Leu Leu Asp He Ala Asn Tyr Leu Met Glu Gin He Gin 
435 440 445 

Asp Asp Cys Thr Gly Asp Glu Asp Tyr Thr Tyr Leu He Leu Arg Val 
450 455 460 

He Gly Asn Met Gly Gin Thr Met Glu Gin Leu Thr Pro Glu Leu Lys 
465 470 475 480 

Ser Ser He Leu Lys Cys Val Gin Ser Thr Lys Pro Ser Leu Met He 

485 490 495

Gln Lys Ala Ala Ile Gln Ala Leu Arg Lys Met Glu Pro Lys Asp Lys

500 505 510 

Asp Gin Glu Val Leu Leu Gin Thr Phe Leu Asp Asp Ala Ser Pro Gly 
515 520 525 

Asp Lys Arg Leu Ala Ala Tyr Leu Met Leu Met Arg Ser Pro Ser Gin 
530 535 540 

Ala Asp lie Asn Lys lie Val Gin lie Leu Pro Trp Glu Gin Asn Glu 
545 550 555 560 

Gin Val Lys Asn Phe Val Ala Ser His lie Ala Asn lie Leu Asn Ser 

565 570 575 

Glu Glu Leu Asp lie Gin Asp Leu Lys Lys Leu Val Lys Glu Ala Leu 

580 585 590 

Lys Glu Ser Gin Leu Pro Thr Val Met Asp Phe Arg Lys Phe Ser Arg 
595 600 605 

Asn Tyr Gin Leu Tyr Lys Ser Val Ser Leu Pro Ser Leu Asp Pro Ala 
610 615 620 

Ser Ala Lys lie Glu Gly Asn Leu lie Phe Asp Pro Asn Asn Tyr Leu 
625 630 635 640 

Pro Lys Glu Ser Met Leu Lys Thr Thr Leu Thr Ala Phe Gly Phe Ala 

645 650 655 

Ser Ala Asp Leu lie Glu lie Gly Leu Glu Gly Lys Gly Phe Glu Pro 

660 665 670 

Thr Leu Glu Ala Leu Phe Gly Lys Gin Gly Phe Phe Pro Asp Ser Val 
675 680 685 

Asn Lys Ala Leu Tyr Trp Val Asn Gly Gin Val Pro Asp Gly Val Ser 
690 695 700 

Lys Val Leu Val Asp His Phe Gly Tyr Thr Lys Asp Asp Lys His Glu 
705 710 715 720 

Gin Asp Met Val Asn Gly lie Met Leu Ser Val Glu Lys Leu He Lys 

725 730 735 

Asp Leu Lys Ser Lys Glu Val Pro Glu Ala Arg Ala Tyr Leu Arg He 

740 745 750 

Leu Gly Glu Glu Leu Gly Phe Ala Ser Leu His Asp Leu Gin Leu Leu 
755 760 765 

Gly Lys Leu Leu Leu Met Gly Ala Arg Thr Leu Gin Gly He Pro Gin 
770 775 780

Met Ile Gly Glu Val Ile Arg Lys Gly Ser Lys Asn Asp Phe Phe Leu
785 790 795 800 

His Tyr He Phe Met Glu Asn Ala Phe Glu Leu Pro Thr Gly Ala Gly 

805 810 815 

Leu Gin Leu Gin He Ser Ser Ser Gly Val He Ala Pro Gly Ala Lys 

820 825 830 

Ala Gly Val Lys Leu Glu Val Ala Asn Met Gin Ala Glu Leu Val Ala 

835 840 845 

Lys Pro Ser Val Ser Val Glu Phe Val Thr Asn Met Gly He He He 
850 855 860 

Pro Asp Phe Ala Arg Ser Gly Val Gin Met Asn Thr Asn Phe Phe His 
865 870 875 880 

Glu Ser Gly Leu Glu Ala His Val Ala Leu Lys Ala Gly Lys Leu Lys 

885 890 895 

Phe He He Pro Ser Pro Lys Arg Pro Val Lys Leu Leu Ser Gly Gly 

900 905 910 

Asn Thr Leu His Leu Val Ser Thr Thr Lys Thr Glu Val He Pro Pro 
915 920 925 

Leu He Glu Asn Arg Gin Ser Trp Ser Val Cys Lys Gin Val Phe Pro 
930 935 940 

Gly Leu Asn Tyr Cys Thr Ser Gly Ala Tyr Ser Asn Ala Ser Ser Thr 
945 950 955 960 

Asp Ser Ala Ser Tyr Tyr Pro Leu Thr Gly Asp Thr Arg Leu Glu Leu 

965 970 975 

Glu Leu Arg Pro Thr Gly Glu He Glu Gin Tyr Ser Val Ser Ala Thr 

980 985 990 

Tyr Glu Leu Gin Arg Glu Asp Arg Ala Leu Val Asp Thr Leu Lys Phe 
995 1000 1005 

Val Thr Gin Ala Glu Gly Ala Lys Gin Thr Glu Ala Thr Met Thr Phe 
1010 1015 1020 

Lys Tyr Asn Arg Gin Ser Met Thr Leu Ser Ser Glu Val Gin He Pro 
1025 1030 1035 1040 

Asp Phe Asp Val Asp Leu Gly Thr He Leu Arg Val Asn Asp Glu Ser 

1045 1050 1055 

Thr Glu Gly Lys Thr Ser Tyr Arg Leu Thr Leu Asp He Gin Asn Lys 

1060 1065 1070

Lys Ile Thr Glu Val Ala Leu Met Gly His Leu Ser Cys Asp Thr Lys
1075 1080 1085 

Glu Glu Arg Lys He Lys Gly Val He Ser He Pro Arg Leu Gin Ala 
1090 1095 1100 

Glu Ala Arg Ser Glu He Leu Ala His Trp Ser Pro Ala Lys Leu Leu 
1105 1110 1115 1120 

Leu Gin Met Asp Ser Ser Ala Thr Ala Tyr Gly Ser Thr Val Ser Lys 

1125 1130 1135 

Arg Val Ala Trp His Tyr Asp Glu Glu Lys He Glu Phe Glu Trp Asn 

1140 1145 1150 

Thr Gly Thr Asn Val Asp Thr Lys Lys Met Thr Ser Asn Phe Pro Val 
1155 1160 1165 

Asp Leu Ser Asp Tyr Pro Lys Ser Leu His Met Tyr Ala Asn Arg Leu 
1170 1175 1180 

Leu Asp His Arg Val Pro Glu Thr Asp Met Thr Phe Arg His Val Gly 
1185 1190 1195 1200 

Ser Lys Leu He Val Ala Met Ser Ser Trp Leu Gin Lys Ala Ser Gly 

1205 1210 1215 

Ser Leu Pro Tyr Thr Gin Thr Leu Gin Asp His Leu Asn Ser Leu Lys 

1220 1225 1230 

Glu Phe Asn Leu Gin Asn Met Gly Leu Pro Asp Phe His He Pro Glu 
1235 1240 1245 

Asn Leu Phe Leu Lys Ser Asp Gly Arg Val Lys Tyr Thr Leu Asn Lys 
1250 1255 1260 

Asn Ser Leu Lys He Glu He Pro Leu Pro Phe Gly Gly Lys Ser Ser 
1265 1270 1275 1280 

Arg Asp Leu Lys Met Leu Glu Thr Val Arg Thr Pro Ala Leu His Phe 

1285 1290 1295 

Lys Ser Val Gly Phe His Leu Pro Ser Arg Glu Phe Gin Val Pro Thr 

1300 1305 1310 

Phe Thr He Pro Lys Leu Tyr Gin Leu Gin Val Pro Leu Leu Gly Val 
1315 1320 1325 

Leu Asp Leu Ser Thr Asn Val Tyr Ser Asn Leu Tyr Asn Trp Ser Ala 
1330 1335 1340 

Ser Tyr Ser Gly Gly Asn Thr Ser Thr Asp His Phe Ser Leu Arg Ala 
1345 1350 1355 1360

Arg Tyr His Met Lys Ala Asp Ser Val Val Asp Leu Leu Ser Tyr Asn

1365 1370 1375 

Val Gin Gly Ser Gly Glu Thr Thr Tyr Asp His Lys Asn Thr Phe Thr 

1380 1385 1390 

Leu Ser Cys Asp Gly Ser Leu Arg His Lys Phe Leu Asp Ser Asn He 
1395 1400 1405 

Lys Phe Ser His Val Glu Lys Leu Gly Asn Asn Pro Val Ser Lys Gly 
1410 1415 1420 

Leu Leu He Phe Asp Ala Ser Ser Ser Trp Gly Pro Gin Met Ser Ala 
1425 1430 1435 1440 

Ser Val His Leu Asp Ser Lys Lys Lys Gln His Leu Phe Val Lys Glu

1445 1450 1455 

Val Lys He Asp Gly Gin Phe Arg Val Ser Ser Phe Tyr Ala Lys Gly 

1460 1465 1470 

Thr Tyr Gly Leu Ser Cys Gin Arg Asp Pro Asn Thr Gly Arg Leu Asn 
1475 1480 1485 

Gly Glu Ser Asn Leu Arg Phe Asn Ser Ser Tyr Leu Gin Gly Thr Asn 
1490 1495 1500 

Gin He Thr Gly Arg Tyr Glu Asp Gly Thr Leu Ser Leu Thr Ser Thr 
1505 1510 1515 1520 

Ser Asp Leu Gin Ser Gly He He Lys Asn Thr Ala Ser Leu Lys Tyr 

1525 1530 1535 

Glu Asn Tyr Glu Leu Thr Leu Lys Ser Asp Thr Asn Gly Lys Tyr Lys

1540 1545 1550 

Asn Phe Ala Thr Ser Asn Lys Met Asp Met Thr Phe Ser Lys Gin Asn 
1555 1560 1565 

Ala Leu Leu Arg Ser Glu Tyr Gin Ala Asp Tyr Glu Ser Leu Arg Phe 
1570 1575 1580 

Phe Ser Leu Leu Ser Gly Ser Leu Asn Ser His Gly Leu Glu Leu Asn
1585 1590 1595 1600 

Ala Asp He Leu Gly Thr Asp Lys He Asn Ser Gly Ala His Lys Ala 

1605 1610 1615 

Thr Leu Arg He Gly Gin Asp Gly He Ser Thr Ser Ala Thr Thr Asn 

1620 1625 1630 

Leu Lys Cys Ser Leu Leu Val Leu Glu Asn Glu Leu Asn Ala Glu Leu
1635 1640 1645

Gly Leu Ser Gly Ala Ser Met Lys Leu Thr Thr Asn Gly Arg Phe Arg
1650 1655 1660 

Glu His Asn Ala Lys Phe Ser Leu Asp Gly Lys Ala Ala Leu Thr Glu 
1665 1670 1675 1680 

Leu Ser Leu Gly Ser Ala Tyr Gin Ala Met He Leu Gly Val Asp Ser 

1685 1690 1695 

Lys Asn He Phe Asn Phe Lys Val Ser Gin Glu Gly Leu Lys Leu Ser 

1700 1705 1710 

Asn Asp Met Met Gly Ser Tyr Ala Glu Met Lys Phe Asp His Thr Asn 

1715 1720 1725 

Ser Leu Asn He Ala Gly Leu Ser Leu Asp Phe Ser Ser Lys Leu Asp 
1730 1735 1740 

Asn He Tyr Ser Ser Asp Lys Phe Tyr Lys Gin Thr Val Asn Leu Gin 
1745 1750 1755 1760 

Leu Gin Pro Tyr Ser Leu Val Thr Thr Leu Asn Ser Asp Leu Lys Tyr 

1765 1770 1775 

Asn Ala Leu Asp Leu Thr Asn Asn Gly Lys Leu Arg Leu Glu Pro Leu 

1780 1785 1790 

Lys Leu His Val Ala Gly Asn Leu Lys Gly Ala Tyr Gin Asn Asn Glu 
1795 1800 1805

He Lys His He Tyr Ala He Ser Ser Ala Ala Leu Ser Ala Ser Tyr 
1810 1815 1820 

Lys Ala Asp Thr Val Ala Lys Val Gin Gly Val Glu Phe Ser His Arg 
1825 1830 1835 1840

Leu Asn Thr Asp He Ala Gly Leu Ala Ser Ala He Asp Met Ser Thr 

1845 1850 1855 

Asn Tyr Asn Ser Asp Ser Leu His Phe Ser Asn Val Phe Arg Ser Val 

1860 1865 1870 

Met Ala Pro Phe Thr Met Thr He Asp Ala His Thr Asn Gly Asn Gly 
1875 1880 1885 

Lys Leu Ala Leu Trp Gly Glu His Thr Gly Gin Leu Tyr Ser Lys Phe 
1890 1895 1900 

Leu Leu Lys Ala Glu Pro Leu Ala Phe Thr Phe Ser His Asp Tyr Lys 
1905 1910 1915 1920 

Gly Ser Thr Ser His His Leu Val Ser Arg Lys Ser He Ser Ala Ala 

1925 1930 1935

Leu Glu His Lys Val Ser Ala Leu Leu Thr Pro Ala Glu Gln Thr Gly
1940 1945 1950

Thr Trp Lys Leu Lys Thr Gln Phe Asn Asn Asn Glu Tyr Ser Gln Asp
1955 1960 1965

Leu Asp Ala Tyr Asn Thr Lys Asp Lys Ile Gly Val Glu Leu Thr Gly
1970 1975 1980

Arg Thr Leu Ala Asp Leu Thr Leu Leu Asp Ser Pro Ile Lys Val Pro
1985 1990 1995 2000

Leu Leu Leu Ser Glu Pro Ile Asn Ile Ile Asp Ala Leu Glu Met Arg
2005 2010 2015

Asp Ala Val Glu Lys Pro Gln Glu Phe Thr Ile Val Ala Phe Val Lys
2020 2025 2030

Tyr Asp Lys Asn Gln Asp Val His Ser Ile Asn Leu Pro Phe Phe Glu
2035 2040 2045

Thr Leu Gln Glu Tyr Phe Glu Arg Asn Arg Gln Thr Ile Ile Val Val
2050 2055 2060

Val Glu Asn Val Gln Arg Asn Leu Lys His Ile Asn Ile Asp Gln Phe
2065 2070 2075 2080

Val Arg Lys Tyr Arg Ala Ala Leu Gly Lys Leu Pro Gln Gln Ala Asn
2085 2090 2095

Asp Tyr Leu Asn Ser Phe Asn Trp Glu Arg Gln Val Ser His Ala Lys
2100 2105 2110

Glu Lys Leu Thr Ala Leu Thr Lys Lys Tyr Arg Ile Thr Glu Asn Asp
2115 2120 2125

Ile Gln Ile Ala Leu Asp Asp Ala Lys Ile Asn Phe Asn Glu Lys Leu
2130 2135 2140

Ser Gln Leu Gln Thr Tyr Met Ile Gln Phe Asp Gln Tyr Ile Lys Asp
2145 2150 2155 2160

Ser Tyr Asp Leu His Asp Leu Lys Ile Ala Ile Ala Asn Ile Ile Asp
2165 2170 2175

Glu Ile Ile Glu Lys Leu Lys Ser Leu Asp Glu His Tyr His Ile Arg
2180 2185 2190

Val Asn Leu Val Lys Thr Ile His Asp Leu His Leu Phe Ile Glu Asn
2195 2200 2205

Ile Asp Phe Asn Lys Ser Gly Ser Ser Thr Ala Ser Trp Ile Gln Asn
2210 2215 2220

Val Asp Thr Lys Tyr Gln Ile Arg Ile Gln Ile Gln Glu Lys Leu Gln
2225 2230 2235 2240

Gin Leu Lys Arg His lie Gin Asn He Asp He Gin His Leu Ala Gly 

2245 2250 2255 

Lys Leu Lys Gin His He Glu Ala He Asp Val Arg Val Leu Leu Asp 

2260 2265 2270 

Gin Leu Gly Thr Thr He Ser Phe Glu Arg He Asn Asp Val Leu Glu 
2275 2280 2285 

His Val Lys His Phe Val He Asn Leu He Gly Asp Phe Glu Val Ala 
2290 2295 2300 

Glu Lys He Asn Ala Phe Arg Ala Lys Val His Glu Leu He Glu Arg 
2305 2310 2315 2320 

Tyr Glu Val Asp Gin Gin He Gin Val Leu Met Asp Lys Leu Val Glu 

2325 2330 2335 

Leu Thr His Gin Tyr Lys Leu Lys Glu Thr He Gin Lys Leu Ser Asn 

2340 2345 2350 

Val Leu Gin Gin Val Lys He Lys Asp Tyr Phe Glu Lys Leu Val Gly 
2355 2360 2365 

Phe He Asp Asp Ala Val Lys Lys Leu Asn Glu Leu Ser Phe Lys Thr 
2370 2375 2380 

Phe He Glu Asp Val Asn Lys Phe Leu Asp Met Leu He Lys Lys Leu 
2385 2390 2395 2400 

Lys Ser Phe Asp Tyr His Gin Phe Val Asp Glu Thr Asn Asp Lys He 

2405 2410 2415 

Arg Glu Val Thr Gin Arg Leu Asn Gly Glu He Gin Ala Leu Glu Leu 

2420 2425 2430 

Pro Gin Lys Ala Glu Ala Leu Lys Leu Phe Leu Glu Glu Thr Lys Ala 
2435 2440 2445 

Thr Val Ala Val Tyr Leu Glu Ser Leu Gin Asp Thr Lys He Thr Leu 
2450 2455 2460 

He He Asn Trp Leu Gin Glu Ala Leu Ser Ser Ala Ser Leu Ala His 
2465 2470 2475 2480 

Met Lys Ala Lys Phe Arg Glu Thr Leu Glu Asp Thr Arg Asp Arg Met 

2485 2490 2495 

Tyr Asp Met Asp He Gin Gin Glu Leu Gin Arg Tyr Leu Ser Leu Val 

2500 2505 2510

Gly Gln Val Tyr Ser Thr Leu Val Thr Tyr Ile Ser Asp Trp Trp Thr
2515 2520 2525 

Leu Ala Ala Lys Asn Leu Thr Asp Phe Ala Glu Gin Tyr Ser He Gin 
2530 2535 2540 

Asp Trp Ala Lys Arg Met Lys Ala Leu Val Glu Gin Gly Phe Thr Val 
2545 2550 2555 2560 

Pro Glu He Lys Thr He Leu Gly Thr Met Pro Ala Phe Glu Val Ser 

2565 2570 2575 

Leu Gin Ala Leu Gin Lys Ala Thr Phe Gin Thr Pro Asp Phe He Val 

2580 2585 2590 

Pro Leu Thr Asp Leu Arg He Pro Ser Val Gin He Asn Phe Lys Asp 
2595 2600 2605 

Leu Lys Asn He Lys He Pro Ser Arg Phe Ser Thr Pro Glu Phe Thr 
2610 2615 2620 

He Leu Asn Thr Phe His He Pro Ser Phe Thr He Asp Phe Val Glu 
2625 2630 2635 2640 

Met Lys Val Lys He He Arg Thr He Asp Gin Met Gin Asn Ser Glu 

2645 2650 2655 

Leu Gin Trp Pro Val Pro Asp He Tyr Leu Arg Asp Leu Lys Val Glu 

2660 2665 2670 

Asp He Pro Leu Ala Arg He Thr Leu Pro Asp Phe Arg Leu Pro Glu 
2675 2680 2685

He Ala He Pro Glu Phe He He Pro Thr Leu Asn Leu Asn Asp Phe 
2690 2695 2700 

Gin Val Pro Asp Leu His He Pro Glu Phe Gin Leu Pro His He Ser 
2705 2710 2715 2720 

His Thr He Glu Val Pro Thr Phe Gly Lys Leu Tyr Ser He Leu Lys 

2725 2730 2735 

He Gin Ser Pro Leu Phe Thr Leu Asp Ala Asn Ala Asp He Gly Asn 

2740 2745 2750 

Gly Thr Thr Ser Ala Asn Glu Ala Gly He Ala Ala Ser He Thr Ala 
2755 2760 2765 

Lys Gly Glu Ser Lys Leu Glu Val Leu Asn Phe Asp Phe Gin Ala Asn 

2770 2775 2780 

Ala Gin Leu Ser Asn Pro Lys He Asn Pro Leu Ala Leu Lys Glu Ser 

2785 2790 2795 2800

Val Lys Phe Ser Ser Lys Tyr Leu Arg Thr Glu His Gly Ser Glu Met

2805 2810 2815 

Leu Phe Phe Gly Asn Ala lie Glu Gly Lys Ser Asn Thr Val Ala Ser 

2820 2825 2830 

Leu His Thr Glu Lys Asn Thr Leu Glu Leu Ser Asn Gly Val He Val 
2835 2840 2845 

Lys He Asn Asn Gin Leu Thr Leu Asp Ser Asn Thr Lys Tyr Phe His 
2850 2855 2860

Lys Leu Asn He Pro Lys Leu Asp Phe Ser Ser Gin Ala Asp Leu Arg 
2865 2870 2875 2880 

Asn Glu lie Lys Thr Leu Leu Lys Ala Gly His He Ala Trp Thr Ser 

2885 2890 2895 

Ser Gly Lys Gly Ser Trp Lys Trp Ala Cys Pro Arg Phe Ser Asp Glu 

2900 2905 2910 

Gly Thr His Glu Ser Gin He Ser Phe Thr He Glu Gly Pro Leu Thr 
2915 2920 2925 

Ser Phe Gly Leu Ser Asn Lys He Asn Ser Lys His Leu Arg Val Asn 
2930 2935 2940 

Gin Asn Leu Val Tyr Glu Ser Gly Ser Leu Asn Phe Ser Lys Leu Glu 
2945 2950 2955 2960 

He Gin Ser Gin Val Asp Ser Gin His Val Gly His Ser Val Leu Thr 

2965 2970 2975 

Ala Lys Gly Met Ala Leu Phe Gly Glu Gly Lys Ala Glu Phe Thr Gly 

2980 2985 2990 

Arg His Asp Ala His Leu Asn Gly Lys Val He Gly Thr Leu Lys Asn 
2995 3000 3005 

Ser Leu Phe Phe Ser Ala Gin Pro Phe Glu He Thr Ala Ser Thr Asn 
3010 3015 3020 

Asn Glu Gly Asn Leu Lys Val Arg Phe Pro Leu Arg Leu Thr Gly Lys 
3025 3030 3035 3040 

He Asp Phe Leu Asn Asn Tyr Ala Leu Phe Leu Ser Pro Ser Ala Gin 

3045 3050 3055 

Gin Ala Ser Trp Gin Val Ser Ala Arg Phe Asn Gin Tyr Lys Tyr Asn 

3060 3065 3070 

Gin Asn Phe Ser Ala Gly Asn Asn Glu Asn He Met Glu Ala His Val 
3075 3080 3085

Gly Ile Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn Ile Pro Leu Thr
3090 3095 3100



He Pro Glu Met Arg Leu Pro Tyr Thr He He Thr Thr Pro Pro Leu 
3105 3110 3115 3120 

Lys Asp Phe Ser Leu Trp Glu Lys Thr Gly Leu Lys Glu Phe Leu Lys 

3125 3130 3135 

Thr Thr Lys Gin Ser Phe Asp Leu Ser Val Lys Ala Gin Tyr Lys Lys 

3140 3145 3150 

Asn Lys His Arg His Ser He Thr Asn Pro Leu Ala Val Leu Cys Glu 
3155 3160 3165 

Phe He Ser Gin Ser He Lys Ser Phe Asp Arg His Phe Glu Lys Asn 
3170 3175 3180 

Arg Asn Asn Ala Leu Asp Phe Val Thr Lys Ser Tyr Asn Glu Thr Lys 
3185 3190 3195 3200 

He Lys Phe Asp Lys Tyr Lys Ala Glu Lys Ser His Asp Glu Leu Pro 

3205 3210 3215 

Arg Thr Phe Gin He Pro Gly Tyr Thr Val Pro Val Val Asn Val Glu 

3220 3225 3230 

Val Ser Pro Phe Thr He Glu Met Ser Ala Phe Gly Tyr Val Phe Pro 
3235 3240 3245 

Lys Ala Val Ser Met Pro Ser Phe Ser He Leu Gly Ser Asp Val Arg 
3250 3255 3260 

Val Pro Ser Tyr Thr Leu He Leu Pro Ser Leu Glu Leu Pro Val Leu 
3265 3270 3275 3280 

His Val Pro Arg Asn Leu Lys Leu Ser Leu Pro His Phe Lys Glu Leu 

3285 3290 3295 

Cys Thr He Ser His He Phe He Pro Ala Met Gly Asn He Thr Tyr 

3300 3305 3310 

Asp Phe Ser Phe Lys Ser Ser Val He Thr Leu Asn Thr Asn Ala Glu 
3315 3320 3325 

Leu Phe Asn Gin Ser Asp He Val Ala His Leu Leu Ser Ser Ser Ser 
3330 3335 3340 

Ser Val He Asp Ala Leu Gin Tyr Lys Leu Glu Gly Thr Thr Arg Leu 
3345 3350 3355 3360

Thr Arg Lys Arg Gly Leu Lys Leu Ala Thr Ala Leu Ser Leu Ser Asn
3365 3370 3375

Lys Phe Val Glu Gly Ser His Asn Ser Thr Val Ser Leu Thr Thr Lys

3380 3385 3390 

Asn Met Glu Val Ser Val Ala Lys Thr Thr Lys Ala Glu He Pro He 
3395 3400 3405 

Leu Arg Met Asn Phe Lys Gin Glu Leu Asn Gly Asn Thr Lys Ser Lys 
3410 3415 3420 

Pro Thr Val Ser Ser Ser Met Glu Phe Lys Tyr Asp Phe Asn Ser Ser 
3425 3430 3435 3440 

Met Leu Tyr Ser Thr Ala Lys Gly Ala Val Asp His Lys Leu Ser Leu 

3445 3450 3455 

Glu Ser Leu Thr Ser Tyr Phe Ser He Glu Ser Ser Thr Lys Gly Asp 

3460 3465 3470 

Val Lys Gly Ser Val Leu Ser Arg Glu Tyr Ser Gly Thr He Ala Ser 
3475 3480 3485 

Glu Ala Asn Thr Tyr Leu Asn Ser Lys Ser Thr Arg Ser Ser Val Lys 
3490 3495 3500 

Leu Gin Gly Thr Ser Lys He Asp Asp He Trp Asn Leu Glu Val Lys 
3505 3510 3515 3520 

Glu Asn Phe Ala Gly Glu Ala Thr Leu Gin Arg He Tyr Ser Leu Trp 

3525 3530 3535 

Glu His Ser Thr Lys Asn His Leu Gin Leu Glu Gly Leu Phe Phe Thr 

3540 3545 3550 

Asn Gly Glu His Thr Ser Lys Ala Thr Leu Glu Leu Ser Pro Trp Gin 
3555 3560 3565 

Met Ser Ala Leu Val Gin Val His Ala Ser Gin Pro Ser Ser Phe His 
3570 3575 3580 

Asp Phe Pro Asp Leu Gly Gin Glu Val Ala Leu Asn Ala Asn Thr Lys 
3585 3590 3595 3600 

Asn Gin Lys He Arg Trp Lys Asn Glu Val Arg He His Ser Gly Ser 

3605 3610 3615 

Phe Gin Ser Gin Val Glu Leu Ser Asn Asp Gin Glu Lys Ala His Leu 

3620 3625 3630 

Asp He Ala Gly Ser Leu Glu Gly His Leu Arg Phe Leu Lys Asn He 
3635 3640 3645 

He Leu Pro Val Tyr Asp Lys Ser Leu Trp Asp Phe Leu Lys Leu Asp 
3650 3655 3660

Val Thr Thr Ser Ile Gly Arg Arg Gln His Leu Arg Val Ser Thr Ala
3665 3670 3675 3680 

Phe Val Tyr Thr Lys Asn Pro Asn Gly Tyr Ser Phe Ser lie Pro Val 

3685 3690 3695

Lys Val Leu Ala Asp Lys Phe lie Thr Pro Gly Leu Lys Leu Asn Asp 

3700 3705 3710 

Leu Asn Ser Val Leu Val Met Pro Thr Phe His Val Pro Phe Thr Asp
3715 3720 3725 

Leu Gin Val Pro Ser Cys Lys Leu Asp Phe Arg Glu lie Gin lie Tyr 
3730 3735 3740 

Lys Lys Leu Arg Thr Ser Ser Phe Ala Leu Asn Leu Pro Thr Leu Pro 
3745 3750 3755 3760 

Glu Val Lys Phe Pro Glu Val Asp Val Leu Thr Lys Tyr Ser Gin Pro 

3765 3770 3775 

Glu Asp Ser Leu lie Pro Phe Phe Glu lie Thr Val Pro Glu Ser Gin 

3780 3785 3790 

Leu Thr Val Ser Gin Phe Thr Leu Pro Lys Ser Val Ser Asp Gly lie 
3795 3800 3805 

Ala Ala Leu Asp Leu Asn Ala Val Ala Asn Lys lie Ala Asp Phe Glu 
3810 3815 3820 

Leu Pro Thr He He Val Pro Glu Gin Thr He Glu He Pro Ser He 
3825 3830 3835 3840 

Lys Phe Ser Val Pro Ala Gly He Val He Pro Ser Phe Gin Ala Leu 

3845 3850 3855 

Thr Ala Arg Phe Glu Val Asp Ser Pro Val Tyr Asn Ala Thr Trp Ser 

3860 3865 3870 

Ala Ser Leu Lys Asn Lys Ala Asp Tyr Val Glu Thr Val Leu Asp Ser 
3875 3880 3885 

Thr Cys Ser Ser Thr Val Gin Phe Leu Glu Tyr Glu Leu Asn Val Leu 
3890 3895 3900 

Gly Thr His Lys He Glu Asp Gly Thr Leu Ala Ser Lys Thr Lys Gly 
3905 3910 3915 3920 

Thr Leu Ala His Arg Asp Phe Ser Ala Glu Tyr Glu Glu Asp Gly Lys 

3925 3930 3935 

Phe Glu Gly Leu Gin Glu Trp Glu Gly Lys Ala His Leu Asn He Lys 

3940 3945 3950 






WO 98/56938 PCT/US98/11927


Ser Pro Ala Phe Thr Asp Leu His Leu Arg Tyr Gin Lys Asp Lys Lys 
3955 3960 3965 

Gly He Ser Thr Ser Ala Ala Ser Pro Ala Val Gly Thr Val Gly Met 
3970 3975 3980 

Asp Met Asp Glu Asp Asp Asp Phe Ser Lys Trp Asn Phe Tyr Tyr Ser 
3985 3990 3995 4000 

Pro Gin Ser Ser Pro Asp Lys Lys Leu Thr He Phe Lys Thr Glu Leu 

4005 4010 4015 

Arg Val Arg Glu Ser Asp Glu Glu Thr Gin He Lys Val Asn Trp Glu 

4020 4025 4030 

Glu Glu Ala Ala Ser Gly Leu Leu Thr Ser Leu Lys Asp Asn Val Pro 
4035 4040 4045 

Lys Ala Thr Gly Val Leu Tyr Asp Tyr Val Asn Lys Tyr His Trp Glu 
4050 4055 4060 

His Thr Gly Leu Thr Leu Arg Glu Val Ser Ser Lys Leu Arg Arg Asn 
4065 4070 4075 4080 

Leu Gln Asn Asn Ala Glu Trp Val Tyr Gln Gly Ala Ile Arg Gln Ile

4085 4090 4095 

Asp Asp He Asp Val Arg Phe Gin Lys Ala Ala Ser Gly Thr Thr Gly 

4100 4105 4110 

Thr Tyr Gin Glu Trp Lys Asp Lys Ala Gin Asn Leu Tyr Gin Glu Leu 
4115 4120 4125 

Leu Thr Gin Glu Gly Gin Ala Ser Phe Gin Gly Leu Lys Asp Asn Val 
4130 4135 4140 

Phe Asp Gly Leu Val Arg Val Thr Gin Lys Phe His Met Lys Val Lys 
4145 4150 4155 4160 

His Leu He Asp Ser Leu He Asp Phe Leu Asn Phe Pro Arg Phe Gin 

4165 4170 4175 

Phe Pro Gly Lys Pro Gly He Tyr Thr Arg Glu Glu Leu Cys Thr Met 

4180 4185 4190 

Phe He Arg Glu Val Gly Thr Val Leu Ser Gin Val Tyr Ser Lys Val 
4195 4200 4205 

His Asn Gly Ser Glu He Leu Phe Ser Tyr Phe Gin Asp Leu Val He 
4210 4215 4220 

Thr Leu Pro Phe Glu Leu Arg Lys His Lys Leu He Asp Val He Ser 
4225 4230 4235 4240

Met Tyr Arg Glu Leu Leu Lys Asp Leu Ser Lys Glu Ala Gln Glu Val

4245 4250 4255 

Phe Lys Ala lie Gin Ser Leu Lys Thr Thr Glu Val Leu Arg Asn Leu 

4260 4265 4270 

Gin Asp Leu Leu Gin Phe lie Phe Gin Leu lie Glu Asp Asn lie Lys 
4275 4280 4285 

Gin Leu Lys Glu Met Lys Phe Thr Tyr Leu lie Asn Tyr lie Gin Asp 
4290 4295 4300 

Glu lie Asn Thr lie Phe Asn Asp Tyr lie Pro Tyr Val Phe Lys Leu 
4305 4310 4315 4320 

Leu Lys Glu Asn Leu Cys Leu Asn Leu His Lys Phe Asn Glu Phe lie 

4325 4330 4335 

Gin Asn Glu Leu Gin Glu Ala Ser Gin Glu Leu Gin Gin lie His Gin 

4340 4345 4350 

Tyr lie Met Ala Leu Arg Glu Glu Tyr Phe Asp Pro Ser He Val Gly 
4355 4360 4365 

Trp Thr Val Lys Tyr Tyr Glu Leu Glu Glu Lys He Val Ser Leu He 
4370 4375 4380 

Lys Asn Leu Leu Val Ala Leu Lys Asp Phe His Ser Glu Tyr He Val 
4385 4390 4395 4400 

Ser Ala Ser Asn Phe Thr Ser Gin Leu Ser Ser Gin Val Glu Gin Phe 

4405 4410 4415 

Leu His Arg Asn He Gin Glu Tyr Leu Ser He Leu Thr Asp Pro Asp 

4420 4425 4430 

Gly Lys Gly Lys Glu Lys He Ala Glu Leu Ser Ala Thr Ala Gin Glu 
4435 4440 4445 

He He Lys Ser Gin Ala He Ala Thr Lys Lys He He Ser Asp Tyr 
4450 4455 4460 

His Gin Gin Phe Arg Tyr Lys Leu Gin Asp Phe Ser Asp Gin Leu Ser 
4465 4470 4475 4480 

Asp Tyr Tyr Glu Lys Phe He Ala Glu Ser Lys Arg Leu He Asp Leu 

4485 4490 4495 

Ser He Gin Asn Tyr His Thr Phe Leu He Tyr He Thr Glu Leu Leu 

4500 4505 4510 

Lys Lys Leu Gin Ser Thr Thr Val Met Asn Pro Tyr Met Lys Leu Ala 
4515 4520 4525

Pro Gly Glu Leu Thr Ile Ile Leu
4530 4535 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Pro Xaa Pro 
1 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Lys Tyr Thr Tyr Asn Tyr Glu Ala Glu Ser Ser Ser Gly Val Pro Gly 
1 5 10 15

Thr Ala Asp Ser Arg Ser Ala Thr Arg Ile Asn Cys Lys Val Glu Leu
20 25 30

Glu Val Pro Gln Leu Cys Ser Phe Ile Leu Lys Thr Ser Gln
35 40 45



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Ala Tyr Asp Phe Asn Tyr Pro Ile Lys Lys Asp Ser Ser Ser Gln Leu
1 5 10 15

Leu Ser Val Gln Gln Gly Glu Thr Ile Tyr Ile Leu Asn Lys Asn Ser
20 25 30

Ser Gly Trp Trp Asp Gly Leu Val Ile Asp Asp Ser Asn
35 40 45



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Val Tyr Gly Phe Asn Pro Glu Gly Lys Ala Leu Leu Lys Lys Thr Lys
1 5 10 15

Asn Ser Glu Glu Phe Ala Ala Ala Met Ser Arg Tyr Glu Leu Lys Leu
20 25 30

Ala Ile Pro Glu Gly Lys Gln Val Phe Leu Tyr Pro Glu
35 40 45



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Leu Tyr Asp Phe Val Ala Ser Gly Asp Asn Thr Leu Ser Ile Thr Lys
1 5 10 15

Gly Glu Lys Leu Arg Val Leu Gly Tyr Asn His Tyr Asn Gly Glu Trp
20 25 30

Cys Glu Ala Gln Thr Lys Asn Gly Gln Gly Trp Val Pro Ser Asn
35 40 45



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Phe Leu Pro Phe Ser Tyr Asn Asn Lys Tyr Gly Met Val Ala Gln Val
1 5 10 15

Thr Gln Thr Leu Lys Leu Glu Asp Thr Pro Lys Ile Asn Ser Arg Phe
20 25 30

Phe Gly Glu Gly Thr Lys Lys Met Gly Leu Ala Phe
35 40



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Leu Phe Asp Tyr Lys Ala Gln Arg Glu Asp Glu Leu Thr Phe Thr Lys
1 5 10 15

Ser Ala Ile Ile Gln Asn Val Glu Lys Gln Glu Gly Gly Trp Trp Arg
20 25 30

Gly Asp Tyr Gly Gly Lys Lys Gln Leu Trp Phe
35 40



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Phe Leu Pro Phe Ser Tyr Asn Asn Lys Tyr Gly Met Val Ala Gln Val
1 5 10 15

Thr Gln Thr Leu Lys Leu Glu Asp Thr Pro Lys Ile Asn Ser Arg Phe
20 25 30

Phe Gly Glu Gly Thr Lys Lys Met Gly Leu Ala Phe Glu
35 40 45



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Leu His Ser Tyr Glu Pro Ser His Asp Gly Asp Leu Gly Phe Glu Lys
1 5 10 15

Gly Glu Gln Leu Arg Ile Leu Glu Gln Ser Gly Glu Trp Trp Lys Ala
20 25 30

Gln Ser Leu Thr Thr Gly Gln Glu Gly Phe Ile Pro Phe Asn
35 40 45



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 62 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Tyr Thr Tyr Leu Ile Leu Arg Val Ile Gly Asn Met Gly Gln Thr Met
1 5 10 15

Glu Gln Leu Thr Pro Glu Leu Lys Ser Ser Ile Leu Lys Cys Val Gln
20 25 30

Ser Thr Lys Pro Ser Leu Met Ile Gln Lys Ala Ala Ile Gln Ala Leu
35 40 45

Arg Lys Met Glu Pro Lys Asp Lys Asp Gln Glu Val Leu Leu
50 55 60



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 53 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Val Val Ala Leu Phe Asp Tyr Ala Ala Val Asn Asp Arg Asp Leu Gln
1 5 10 15

Val Leu Lys Gly Glu Lys Leu Gln Val Leu Arg Ser Thr Gly Asp Trp
20 25 30

Trp Leu Ala Arg Ser Leu Val Thr Gly Arg Glu Gly Tyr Val Pro Ser
35 40 45

Asn Phe Val Ala Pro
50



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Ala Phe Gly Phe Ala Ser Ala Asp Leu Ile Glu Ile Gly Leu Glu Gly
1 5 10 15

Lys Gly Phe Glu Pro Thr Leu Glu Ala Leu Phe Gly Lys Gln Gly Phe
20 25 30

Phe Pro Asp Ser Val Asn Lys Ala Leu Tyr Trp Val Asn Gly Gln Val
35 40 45

Pro Asp
50



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Leu Tyr Asp Phe Ala Ala Glu Asn Pro Asp Glu Leu Thr Phe T^n Glu 
1 5 10 15

Gly Ala Val Val Thr Val Ile Asn Lys Ser Asn Pro Asp Trp Trp Glu
20 25 30

Gly Glu Leu Asn Gly Gln Arg Gly Val Phe Pro Ala Ser Tyr Val Glu
35 40 45



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Phe Gly Tyr Thr Lys Asp Asp Lys His Glu Gln Asp Met Val Asn Gly
1 5 10 15

Ile Met Leu Ser Val Glu Lys Leu Ile Lys Asp Leu Lys Ser Lys Glu
20 25 30

Val Pro Glu Ala Arg Ala Tyr Leu Arg Ile Leu Gly Glu Glu
35 40 45



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Tyr Asp Tyr Lys Lys Glu Glu Glu Asp Ile Asp Leu His Leu Gly Asp
1 5 10 15

Ile Leu Thr Val Asn Lys Gly Ser Leu Val Ala Leu Gly Phe Ser Asp
20 25 30

Gly Gln Glu Ala Lys Pro Glu Glu Ile Gly Trp Leu Asn Gly Tyr Asn
35 40 45

Glu



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Phe Asp Tyr His Gln Phe Val Asp Glu Thr Asn Asp Lys Ile Arg Glu
1 5 10 15

Val Thr Gln Arg Leu Asn Gly Glu Ile Gln Ala Leu Glu Leu Pro Gln
20 25 30

Lys Ala Glu Ala Leu Lys Leu Phe Leu Glu Glu Thr Lys Ala Thr Val
35 40 45

Ala Val Tyr Leu
50



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Tyr Asp Tyr Gln Glu Lys Ser Pro Arg Glu Val Thr Met Lys Lys Gly
1 5 10 15

Asp Ile Leu Thr Leu Leu Asn Ser Thr Asn Lys Asp Trp Trp Lys Val
20 25 30

Glu Val Asn Asp Arg Gln Gly Phe Val Pro Ala Ala Tyr Val
35 40 45



(2) INFORMATION FOR SEQ ID NO: 19: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Tyr Asp Met Asp Ile Gln Gln Glu Leu Gln Arg Tyr Leu Ser Leu Val
1 5 10 15

Gly Gln Val Tyr Ser Thr Leu Val Thr Tyr Ile Ser Asp Trp Trp Thr
20 25 30

Leu Ala Ala Lys Asn Leu Thr Asp Phe Ala Glu Gln Tyr Ser Ile Gln
35 40 45

Asp Trp Ala
50



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Phe Asp Tyr Lys Ala Gln Arg Glu Asp Glu Leu Thr Phe Thr Lys Ser
1 5 10 15

Ala Ile Ile Gln Asn Val Glu Lys Gln Asp Gly Gly Trp Trp Arg Gly
20 25 30

Asp Tyr Gly Gly Lys Lys Gln Leu Trp Phe Pro Ser Asn Tyr Val Glu
35 40 45

Glu Met Ile
50



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Tyr Asp Met Asp Ile Gln Gln Glu Leu Gln Arg Tyr Leu Ser Leu Val
1 5 10 15

Gly Gln Val Tyr Ser Thr Leu Val Thr Tyr Ile Ser Asp Trp Trp Thr
20 25 30

Leu Ala Ala Lys Asn Leu Thr Asp Phe Ala Glu Gln Tyr Ser Ile Gln
35 40 45

Asp Trp Ala Lys Arg Met Lys
50 55



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 53 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Ile Gln Asp Tyr Glu Pro Arg Leu Thr Asp Glu Ile Arg Ile Ser Leu
1 5 10 15

Gly Glu Lys Val Lys Ile Leu Ala Thr His Thr Asp Gly Trp Cys Leu
20 25 30

Val Glu Lys Cys Asn Thr Arg Lys Gly Thr Ile His Val Ser Val Asp
35 40 45

Asp Lys Arg Tyr Leu
50



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Tyr Asp Tyr Glu Ala Arg Thr Glu Asp Asp Leu Thr Phe Thr Lys Gly
1 5 10 15

Glu Lys Phe His Ile Leu Asn Asn Thr Glu Gly Asp Trp Trp Glu Ala
20 25 30

Arg Ser Leu Ser Ser Gly Lys Thr Gly Cys Ile Pro Ser Asn Tyr Val
35 40 45

Ala



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Thr Tyr Asp Phe Ser Phe Lys Ser Ser Val Ile Thr Leu Asn Thr Asn
1 5 10 15

Ala Glu Leu Phe Asn Gln Ser Asp Ile Val Ala His Leu Leu Ser Ser
20 25 30

Ser Ser Ser Val Ile Asp Ala Leu Gln Tyr Lys Leu Glu
35 40 45



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Asp Phe Asn Tyr Pro Ile Lys Lys Asp Ser Ser Ser Gln Leu Leu Ser
1 5 10 15

Val Gln Gln Gly Glu Thr Ile Tyr Ile Leu Asn Lys Asn Ser Ser Gly
20 25 30

Trp Trp Asp Gly Leu Val Ile Asp Asp Ser Asn Gly Lys Val Asn
35 40 45

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Lys Tyr Asp Phe Asn Ser Ser Met Leu Tyr Ser Thr Ala Lys Gly Ala
1 5 10 15

Val Asp His Lys Leu Ser Leu Glu Ser Leu Thr Ser Tyr Phe Ser Ile
20 25 30

Glu Ser Ser Thr Lys Gly Asp Val Lys Gly Ser Val Leu Ser Arg Glu
35 40 45

Tyr



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 52 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Glu Pro Tyr Val Ala Ile Lys Ala Tyr Thr Ala Val Glu Gly Asp Glu
1 5 10 15

Val Ser Leu Leu Glu Gly Glu Ala Val Glu Val Ile His Lys Leu Leu
20 25 30

Asp Gly Trp Trp Val Ile Arg Lys Asp Asp Val Thr Gly Tyr Phe Pro
35 40 45

Ser Met Tyr Leu
50



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Leu Trp Asp Phe Leu Lys Leu Asp Val Thr Thr Ser Ile Gly Arg Arg
1 5 10 15

Gln His Leu Arg Val Ser Thr Ala Phe Val Tyr Thr Lys Asn Pro Asn
20 25 30

Gly Tyr Ser Phe Ser Ile Pro Val Lys Val Leu Ala Asp Lys Phe Ile
35 40 45

Thr Pro Gly Leu Lys Leu
50



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Leu Tyr Asp Phe Lys Ala Glu Lys Ala Asp Glu Leu Thr Thr Tyr Val
1 5 10 15

Gly Glu Asn Leu Phe Ile Cys Ala His His Asn Cys Glu Trp Phe Ile
20 25 30

Ala Lys Pro Ile Gly Arg Leu Gly Gly Pro Gly Leu Val Pro Val Gly
35 40 45

Phe Val Ser Ile Ile Asp Ile
50 55



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Val Leu Tyr Asp Tyr Val Asn Lys Tyr His Trp Glu His Thr Gly Leu
1 5 10 15

Thr Leu Arg Glu Val Ser Ser Lys Leu Arg Arg Asn Leu Gln Asn Asn
20 25 30

Ala Glu Trp Val Tyr Gln Gly Ala Ile Arg Gln Ile Asp Asp Ile
35 40 45



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

Val Leu Tyr Asp Phe Lys Ala Glu Lys Ala Asp Glu Leu Thr Thr Tyr
1 5 10 15

Val Gly Glu Asn Leu Phe Ile Cys Ala His His Asn Cys Glu Trp Phe
20 25 30

Ile Ala Lys Pro Ile Gly Arg Leu
35 40



(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Lys Pro Gly Ile Tyr Thr Arg Glu Glu Leu Cys Thr Met Phe Ile Arg
1 5 10 15

Glu Val Gly Thr Val Leu Ser Gln Val Tyr Ser Lys Val His Asn Gly
20 25 30

Ser Glu Ile Leu Phe Ser Tyr Phe Gln Asp Leu
35 40



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Leu Phe Gly Phe Val Pro Glu Thr Lys Glu Glu Leu Gln Val Met Pro
1 5 10 15

Gly Asn Ile Val Phe Val Leu Lys Lys Gly Asn Asp Asn Trp Ala Thr
20 25 30

Val Met Phe Asn Gly Gln Lys Gly Leu Val Pro Cys Asn Tyr Leu Glu
35 40 45

Pro Val Glu Leu
50



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Gly Lys Pro Gly Ile Tyr Thr Arg Glu Glu Leu Cys Thr Met Phe Ile
1 5 10 15

Arg Glu Val Gly Thr Val Leu Ser Gln Val Tyr Ser Lys Val His Asn
20 25 30

Gly Ser Glu Ile Leu Phe Ser Tyr Phe Gln Asp
35 40



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 52 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Ala Lys Phe Asp Tyr Val Ala Gln Gln Glu Gln Glu Leu Asp Ile Lys
1 5 10 15

Lys Asn Glu Arg Leu Trp Leu Leu Asp Asp Ser Lys Ser Trp Trp Arg
20 25 30

Val Arg Asn Ser Met Asn Lys Thr Gly Phe Val Pro Ser Asn Tyr Val
35 40 45

Glu Arg Lys Asn
50



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 85 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Trp Tyr His Ala Ser Leu Thr Arg Ala Gln Ala Glu His Met Leu Met
1 5 10 15

Arg Val Pro Arg Asp Gly Ala Phe Leu Val Arg Lys Arg Asn Glu Pro
20 25 30

Asn Ser Tyr Ala Ile Ser Phe Arg Ala Glu Gly Lys Ile Lys His Cys
35 40 45

Arg Val Gln Gln Glu Gly Thr Val Met Leu Gly Asn Ser Glu Phe Asp
50 55 60

Ser Leu Val Asp Leu Ile Ser Tyr Tyr Glu Lys His Pro Leu Tyr Arg
65 70 75 80

Lys Met Lys Leu Lys
85



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 106 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Phe Phe Gly Glu Gly Thr Lys Lys Met Gly Leu Ala Phe Glu Ser Thr
1 5 10 15

Lys Ser Thr Ser Pro Pro Lys Gln Ala Glu Ala Val Leu Lys Thr Leu
20 25 30

Gln Glu Leu Lys Lys Leu Thr Ile Ser Glu Gln Asn Ile Gln Arg Ala
35 40 45

Asn Leu Phe Asn Lys Leu Val Thr Glu Leu Arg Gly Leu Ser Asp Glu
50 55 60

Ala Val Thr Ser Leu Leu Pro Gln Leu Ile Glu Val Ser Ser Pro Ile
65 70 75 80

Thr Leu Gln Ala Leu Val Gln Cys Gly Gln Pro Cys Ser Thr His Ile
85 90 95

Leu Gln Trp Leu Lys Arg Val His Ala Asn
100 105



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 91 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Trp Phe His Gly Lys Ile Ser Lys Gln Glu Ala Tyr Asn Leu Leu Met
1 5 10 15

Thr Val Gly Gln Ala Cys Ser Phe Leu Val Arg Pro Ser Asp Asn Thr
20 25 30

Pro Gly Asp Tyr Ser Leu Tyr Phe Arg Thr Ser Glu Asn Ile Gln Arg
35 40 45

Phe Lys Ile Cys Pro Thr Pro Asn Asn Gln Phe Met Met Gly Gly Arg
50 55 60

Tyr Tyr Asn Ser Ser Ile Gly Asp Ile Ile Asp His Tyr Arg Lys Glu
65 70 75 80

Gln Ile Val Glu Gly Tyr Tyr Leu Lys Glu Pro
85 90



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 93 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Ile Met Leu Ser Val Glu Lys Leu Ile Lys Asp Leu Lys Ser Lys Glu
1 5 10 15

Val Pro Glu Ala Arg Ala Tyr Leu Arg Ile Leu Gly Glu Glu Leu Gly
20 25 30

Phe Ala Ser Leu His Asp Leu Gln Leu Leu Gly Lys Leu Leu Leu Met
35 40 45

Gly Ala Arg Thr Leu Gln Gly Ile Pro Gln Met Ile Gly Glu Val Ile
50 55 60

Arg Lys Gly Ser Lys Asn Asp Phe Phe Leu His Tyr Ile Phe Met Glu
65 70 75 80

Asn Ala Phe Glu Leu Pro Thr Gly Ala Gly Leu Gln Leu
85 90



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Trp Phe His Gly Lys Ile Ser Lys Gln Glu Ala Tyr Asn Leu Leu Met
1 5 10 15

Thr Val Gly Gln Ala Cys Ser Phe Leu Val Arg Pro Ser Asp Asn Thr
20 25 30

Pro Gly Asp Tyr Ser Leu Tyr Phe Arg Thr Ser Glu Asn Ile Gln Arg
35 40 45

Phe Lys Ile Cys Pro Thr Pro Asn Asn Gln Phe Met Met Gly Gly Arg
50 55 60

Tyr Tyr Asn Ser Ser Ile Gly Asp Ile Ile Asp His Tyr Arg Lys Glu
65 70 75 80

Gln Ile Val Glu Gly Tyr Tyr Leu Lys
85



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 77 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Tyr Phe His Lys Leu Asn Ile Pro Lys Leu Asp Phe Ser Ser Gln Ala
1 5 10 15

Asp Leu Arg Asn Glu Ile Lys Thr Leu Leu Lys Ala Gly His Ile Ala
20 25 30

Trp Thr Ser Ser Gly Lys Gly Ser Trp Lys Trp Ala Cys Pro Arg Phe
35 40 45

Ser Asp Glu Gly Thr His Glu Ser Gln Ile Ser Phe Thr Ile Glu Gly
50 55 60

Pro Leu Thr Ser Phe Gly Leu Ser Asn Lys Ile Asn Ser
65 70 75



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 99 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Trp Tyr Trp Gly Asp Ile Ser Arg Glu Glu Val Asn Glu Lys Leu Arg
1 5 10 15

Asp Thr Pro Asp Gly Thr Phe Leu Val Arg Asp Ala Ser Ser Lys Ile
20 25 30

Gln Gly Asp Tyr Thr Leu Thr Leu Arg Lys Gly Gly Asn Asn Lys Leu
35 40 45

Ile Lys Val Phe His Arg Asp Gly Lys Tyr Gly Phe Ser Glu Pro Leu
50 55 60

Thr Phe Cys Ser Val Val Asp Leu Ile Thr His Tyr Arg His Glu Ser
65 70 75 80

Leu Ala Gln Tyr Asn Ala Lys Leu Asp Thr Arg Leu Leu Tyr Pro Val
85 90 95

Ser Lys Tyr



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

Phe Phe Ser Ala Gln Pro Phe Glu Ile Thr Ala Ser Thr Asn Asn Glu
1 5 10 15

Gly Asn Leu Lys Val Arg Phe Pro Leu Arg Leu Thr Gly Lys Ile Asp
20 25 30

Phe Leu Asn Asn Tyr Ala Leu Phe Leu Ser Pro Ser Ala Gln Gln Ala
35 40 45

Ser Trp Gln Val Ser Ala Arg Phe Asn Gln Tyr Lys Tyr Asn Gln Asn
50 55 60

Phe Ser Ala Gly Asn Asn Glu Asn Ile Met Glu Ala His Val Gly Ile
65 70 75 80

Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn Ile Pro Leu Thr Ile Pro
85 90 95

Glu Met Arg Leu
100



(2) INFORMATION FOR SEQ ID NO: 44: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 106 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Trp Phe His Gly Lys Leu Gly Ala Gly Arg Asp Gly Arg His Ile Ala
1 5 10 15

Glu Arg Leu Leu Thr Glu Tyr Cys Ile Glu Thr Gly Ala Pro Asp Gly
20 25 30

Ser Phe Leu Val Arg Glu Ser Glu Thr Phe Val Gly Asp Tyr Thr Leu
35 40 45

Ser Phe Trp Arg Asn Gly Lys Val Gln His Cys Arg Ile His Ser Arg
50 55 60

Gln Asp Ala Gly Thr Pro Lys Phe Phe Leu Thr Asp Asn Leu Val Phe
65 70 75 80

Asp Ser Leu Tyr Asp Leu Ile Thr His Tyr Gln Gln Val Pro Leu Arg
85 90 95

Cys Asn Glu Phe Glu Met Arg Leu Ser Glu
100 105



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 91 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Phe Pro Gly Lys Pro Gly Ile Tyr Thr Arg Glu Glu Leu Cys Thr Met
1 5 10 15

Phe Ile Arg Glu Val Gly Thr Val Leu Ser Gln Val Tyr Ser Lys Val
20 25 30

His Asn Gly Ser Glu Ile Leu Phe Ser Tyr Phe Gln Asp Leu Val Ile
35 40 45

Thr Leu Pro Phe Glu Leu Arg Lys His Lys Leu Ile Asp Val Ile Ser
50 55 60

Met Tyr Arg Glu Leu Leu Lys Asp Leu Ser Lys Glu Ala Gln Glu Val
65 70 75 80

Phe Lys Ala Ile Gln Ser Leu Lys Thr Thr Glu
85 90



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 203 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Val Ser Asp Gly Ile Ala Ala Leu Asp Leu Asn Ala Val Ala Asn Lys
1 5 10 15

Ile Ala Asp Phe Glu Leu Pro Thr Ile Ile Val Pro Glu Gln Thr Ile
20 25 30

Glu Ile Pro Ser Ile Lys Phe Ser Val Pro Ala Gly Ile Val Ile Pro
35 40 45

Ser Phe Gln Ala Leu Thr Ala Arg Phe Glu Val Asp Ser Pro Val Tyr
50 55 60

Asn Ala Thr Trp Ser Ala Ser Leu Lys Asn Lys Ala Asp Tyr Val Glu
65 70 75 80

Thr Val Leu Asp Ser Thr Cys Ser Ser Thr Val Gln Phe Leu Glu Tyr
85 90 95

Glu Leu Asn Val Leu Gly Thr His Lys Ile Glu Asp Gly Thr Leu Ala
100 105 110

Ser Lys Thr Lys Gly Thr Leu Ala His Arg Asp Phe Ser Ala Glu Tyr
115 120 125

Glu Glu Asp Gly Lys Phe Glu Gly Leu Gln Glu Trp Glu Gly Lys Ala
130 135 140

His Leu Asn Ile Lys Ser Pro Ala Phe Thr Asp Leu His Leu Arg Tyr
145 150 155 160

Gln Lys Asp Lys Lys Gly Ile Ser Thr Ser Ala Ala Ser Pro Ala Val
165 170 175

Gly Thr Val Gly Met Asp Met Asp Glu Asp Asp Asp Phe Ser Lys Trp
180 185 190

Asn Phe Tyr Tyr Ser Pro Gln Ser Ser Pro Asp
195 200



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 214 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

Leu Gly Gln Gly Cys Phe Gly Glu Val Trp Met Gly Thr Trp Asn Gly
1 5 10 15

Thr Thr Arg Val Ala Ile Lys Thr Leu Lys Pro Gly Thr Met Ser Pro
20 25 30

Glu Ala Phe Leu Gln Glu Ala Gln Val Met Lys Lys Leu Arg His Glu
35 40 45

Lys Leu Val Gln Leu Tyr Ala Val Val Ser Glu Glu Pro Ile Tyr Ile
50 55 60

Val Thr Glu Tyr Met Ser Lys Gly Ser Leu Leu Asp Phe Leu Lys Gly
65 70 75 80

Glu Thr Gly Lys Tyr Leu Arg Leu Pro Gln Leu Val Asp Met Ala Ala
85 90 95

Gln Ile Ala Ser Gly Met Ala Tyr Val Glu Arg Met Asn Tyr Val His
100 105 110

Arg Asp Leu Arg Ala Ala Asn Ile Leu Val Gly Glu Asn Leu Val Cys
115 120 125

Lys Val Ala Asp Phe Gly Leu Ala Arg Leu Ile Glu Asp Asn Glu Tyr
130 135 140

Thr Ala Arg Gln Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro Glu
145 150 155 160

Ala Ala Leu Tyr Gly Arg Phe Thr Ile Lys Ser Asp Val Trp Ser Phe
165 170 175

Gly Ile Leu Leu Thr Glu Leu Thr Thr Lys Gly Arg Val Pro Tyr Pro
180 185 190

Gly Met Val Asn Arg Glu Val Leu Asp Gln Val Glu Arg Gly Tyr Arg
195 200 205

Met Pro Cys Pro Pro Glu
210



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Leu Gly Asn Gly Gln Phe Gly Glu Val Trp Met Gly Thr Trp Asn Gly
1 5 10 15

Asn Thr Lys Val Ala Ile Lys Thr Leu Lys Pro Gly Thr Met Ser Pro
20 25 30

Glu Ser Phe Leu Glu Glu Ala Gln Ile Met Lys Lys Leu Lys His Asp
35 40 45

Lys Leu Val Gln Leu Tyr Ala Val Val Ser Glu Glu Pro Ile Tyr Ile
50 55 60

Val Thr Glu Tyr Met Asn Lys Gly Ser Leu Leu Asp Phe Leu Lys Asp
65 70 75 80

Gly Glu Gly Arg Ala Leu Lys Leu Pro Asn Leu Val Asp Met Ala Ala
85 90 95

Gln Val Ala Ala Gly Met Ala Tyr Ile Glu Arg Met Asn Tyr Ile His
100 105 110

Arg Asp Leu Arg Ser Ala Asn Ile Leu Val Gly Asn Gly Leu Ile Cys
115 120 125

Lys Ile Ala Asp Phe Gly Leu Ala Arg Leu Ile Glu Asp Asn Glu Tyr
130 135 140






Thr Ala Arg Gln Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro Glu
145 150 155 160

Ala Ala Leu Tyr Gly Arg Phe Thr Ile Lys Ser Asp Val Trp Ser Phe
165 170 175

Gly Ile Leu Leu Thr Glu Leu Val Thr Lys Gly Arg Val Pro Tyr Pro
180 185 190

Gly Met Asn Asn Arg Glu Val Leu Glu Gln Val Glu Arg Gly Tyr Arg
195 200 205

Met Pro Cys Pro Gln
210



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 213 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

Leu Gly Ala Gly Gln Phe Gly Glu Val Trp Met Ala Thr Tyr Asn Lys
1 5 10 15

His Thr Lys Val Ala Val Lys Thr Met Lys Pro Gly Ser Met Ser Val
20 25 30

Glu Ala Phe Leu Ala Glu Ala Asn Val Met Lys Thr Leu Gln His Asp
35 40 45

Lys Leu Val Lys Leu His Ala Val Val Thr Lys Glu Pro Ile Tyr Ile
50 55 60

Ile Thr Glu Phe Met Ala Lys Gly Ser Leu Leu Asp Phe Leu Lys Ser
65 70 75 80

Asp Glu Gly Ser Lys Gln Pro Leu Pro Lys Leu Ile Asp Phe Ser Ala
85 90 95

Gln Ile Ala Glu Gly Met Ala Phe Ile Glu Gln Arg Asn Tyr Ile His
100 105 110

Arg Asp Leu Arg Ala Ala Asn Ile Leu Val Ser Ala Ser Leu Val Cys
115 120 125

Lys Ile Ala Asp Phe Gly Leu Ala Arg Val Ile Glu Asp Asn Glu Tyr
130 135 140

Thr Ala Arg Glu Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro Glu
145 150 155 160

Ala Ile Asn Phe Gly Ser Phe Thr Ile Lys Ser Asp Val Trp Ser Phe
165 170 175

Gly Ile Leu Leu Met Glu Ile Val Thr Tyr Gly Arg Ile Pro Tyr Pro
180 185 190

Gly Met Ser Asn Pro Glu Val Ile Arg Ala Leu Glu Arg Gly Tyr Arg
195 200 205

Met Pro Arg Pro Glu
210



(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 218 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Leu Gly Ala Gly Gln Phe Gly Glu Val Trp Met Gly Tyr Tyr Asn Asn
1 5 10 15

Ser Thr Lys Val Ala Val Lys Thr Leu Lys Pro Gly Thr Met Ser Val
20 25 30

Gln Ala Phe Leu Glu Glu Ala Asn Leu Met Lys Thr Leu Gln His Asp
35 40 45

Lys Leu Val Arg Leu Tyr Ala Val Val Thr Arg Glu Glu Pro Ile Tyr
50 55 60

Ile Ile Thr Glu Tyr Met Ala Lys Gly Ser Leu Leu Asp Phe Leu Lys
65 70 75 80

Ser Asp Glu Gly Gly Lys Val Leu Leu Pro Lys Leu Ile Asp Phe Ser
85 90 95

Ala Gln Ile Ala Glu Gly Met Ala Tyr Ile Glu Arg Lys Asn Tyr Ile
100 105 110

His Arg Asp Leu Arg Ala Ala Asn Val Leu Val Ser Glu Ser Leu Met
115 120 125

Cys Lys Ile Ala Asp Phe Gly Leu Ala Arg Val Ile Glu Asp Asn Glu
130 135 140

Tyr Thr Ala Arg Glu Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro
145 150 155 160

Glu Ala Ile Asn Phe Gly Cys Phe Thr Ile Lys Ser Asp Val Trp Ser
165 170 175

Phe Gly Ile Leu Leu Tyr Glu Ile Val Thr Tyr Gly Lys Ile Pro Tyr
180 185 190

Pro Gly Arg Thr Asn Ala Asp Val Met Thr Ala Leu Ser Gln Gly Tyr
195 200 205

Arg Met Pro Arg Val Glu Asn Cys Pro Asp
210 215



(2) INFORMATION FOR SEQ ID NO: 51: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 213 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Leu Gly Ala Gly Gln Phe Gly Glu Val Trp Met Gly Tyr Tyr Asn Gly
1 5 10 15

His Thr Lys Val Ala Val Lys Ser Leu Lys Gln Gly Ser Met Ser Pro
20 25 30

Asp Ala Phe Leu Ala Glu Ala Asn Leu Met Lys Gln Leu Gln His Gln
35 40 45

Arg Leu Val Arg Leu Tyr Ala Val Val Thr Gln Glu Pro Ile Tyr Ile
50 55 60

Ile Thr Glu Tyr Met Glu Asn Gly Ser Leu Val Asp Phe Leu Lys Thr
65 70 75 80

Pro Ser Gly Ile Lys Leu Thr Ile Asn Lys Leu Leu Asp Met Ala Ala
85 90 95

Gln Ile Ala Glu Gly Met Ala Phe Ile Glu Glu Arg Asn Tyr Ile His
100 105 110

Arg Asp Leu Arg Ala Ala Asn Ile Leu Val Ser Asp Thr Leu Ser Cys
115 120 125

Lys Ile Ala Asp Phe Gly Leu Ala Arg Leu Ile Glu Asp Asn Glu Tyr
130 135 140

Thr Ala Arg Glu Gly Ala Lys Phe Pro Ile Lys Trp Thr Ala Pro Glu
145 150 155 160

Ala Ile Asn Tyr Gly Thr Phe Thr Ile Lys Ser Asp Val Trp Ser Phe
165 170 175

Gly Ile Leu Leu Thr Glu Ile Val Thr His Gly Arg Ile Pro Tyr Pro
180 185 190

Gly Met Thr Asn Pro Glu Val Ile Gln Asn Leu Glu Arg Gly Tyr Arg
195 200 205

Met Val Arg Pro Asp
210



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Arg Lys Asn Tyr Ile His Arg Asp Leu Arg Ala Ala Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

Lys Gly Thr Leu Ala His Arg Asp Phe Ser Ala Glu
1 5 10



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Thr Lys Val Ala Val Lys Thr Leu Lys Pro Gly
1 5 10



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 






(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Asp Lys Val Ala Ile Lys Thr Ile Arg Glu Gly
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Asp Leu Asn Ala Val Ala Asn Lys Ile Ala Asp
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Thr Ser Leu Arg Ala Pro Thr Met Pro Pro Pro Leu Pro Pro Val Pro
1 5 10 15

Pro Gln Pro Ala Arg Arg Gln Ser Arg Arg Leu Pro Ala Ser Pro Val
20 25 30

Ile Ser



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Ser Asp Ala Glu Gly Thr Ala Val Ala Pro Pro Thr Val Thr Pro Val
1 5 10 15

Pro Ser Leu Glu Ala Pro Ser Glu Gln Ala Pro Thr Glu Gln Arg Pro
20 25 30

Gly Val Gln Glu
35



(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Ser Asp Ala Glu Gly Thr Ala Val Ala Pro Pro Thr Ile Thr Pro Ile
1 5 10 15

Pro Ser Leu Glu Ala Pro Ser Glu Gln Ala Pro Thr Glu Gln Arg Pro
20 25 30

Gly Val Gln Glu
35



(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Ser Asp Ala Glu Trp Thr Ala Phe Val Pro Pro Asn Val Ile Leu Ala
1 5 10 15

Pro Ser Leu Glu Ala Phe Phe Glu Gln Ala Leu Thr Glu Glu Thr Pro
20 25 30

Gly Val Gln Asp
35



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Leu Val Thr Glu Ser Ser Val Leu Ala Thr Leu Thr Val Val Pro Asp
1 5 10 15

Pro Ser Thr Glu Ala Ser Ser Glu Glu Ala Pro Thr Glu Gln Ser Pro
20 25 30

Gly Val Gln Asp
35



(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Pro Val Met Glu Ser Thr Leu Leu Thr Thr Pro Thr Val Val Pro Val
1 5 10 15

Pro Ser Thr Glu Leu Pro Ser Glu Glu Ala Pro Thr Glu Asn Ser Thr
20 25 30

Gly Val Gln Asp
35



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Pro Val Thr Glu Ser Ser Val Leu Thr Thr Pro Thr Val Ala Pro Val
1 5 10 15

Pro Ser Thr Glu Ala Pro Ser Glu Gln Ala Pro Pro Glu Lys Ser Pro
20 25 30

Val Val Gln Asp
35



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Ser Glu Thr Glu Ser Gly Val Leu Glu Thr Pro Thr Val Val Pro Glu
1 5 10 15

Pro Ser Met Glu Ala His Ser Glu Ala Ala Pro Thr Glu Gln Thr Pro
20 25 30

Val Val Arg Gln
35



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Ser Asp Thr Glu Ser Gly Thr Val Val Ala Pro Pro Thr Val Ile Gln
1 5 10 15

Val Pro Ser Leu Gly Pro Pro Ser Glu Gln Asp
20 25



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Pro Lys Asp Ala Thr Arg Phe Lys His Leu Arg Lys Tyr Thr Tyr Asn
1 5 10 15

Tyr Glu Ala Glu Ser Ser Ser Gly Val Pro Gly Thr Ala Asp Ser Arg
20 25 30

Ser Ala Thr Arg Ile
35



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Pro Lys Asp Ala Ser Gln Arg Arg Arg Ser Leu Glu Pro Ala Glu Asn
1 5 10 15

Val His Gly Ala Gly Gly Gly Ala Phe Pro Ala Ser Gln Thr Pro Ser
20 25 30

Lys Pro



(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Asp Lys Glu Ala Thr Lys Leu Thr Glu Glu Arg Asp Gly Ser Leu Asn
1 5 10 15

Gln Ser Ser Gly Tyr Arg Tyr Gly Thr Asp Pro Thr Pro Gln His Tyr
20 25 30



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

Ile Gln Asn Tyr His Thr Phe Leu Ile Tyr Ile Thr Glu Leu Leu Lys
1 5 10 15

Lys Leu Gln Ser Thr Thr Val Met Asn Pro Tyr Met Lys Leu Ala Pro
20 25 30

Gly Glu Leu Thr Ile Ile Leu
35



(2) INFORMATION FOR SEQ ID NO: 70: 
(i) SEQUENCE CHARACTERISTICS: 


(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

Pro Glu Glu Arg Pro Thr Phe Glu Tyr Leu Gln Ala Phe Leu Glu Asp
1 5 10 15

Tyr Phe Thr Ser Thr Glu Pro Gln Tyr Gln Pro Gly Glu Asn Leu
20 25 30



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Pro Glu Glu Arg Pro Thr Phe Glu Tyr Leu Gln Ser Phe Leu Glu Asp
1 5 10 15

Tyr Phe Thr Ala Thr Glu Pro Gln Tyr Gln Pro Gly Glu Asn Leu
20 25 30



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Pro Glu Glu Arg Pro Thr Phe Glu Tyr Ile Gln Ser Val Leu Asp Asp
1 5 10 15

Phe Tyr Thr Ala Thr Glu Ser Gln Tyr Gln Gln Gln Pro
20 25



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

Ala Glu Glu Arg Pro Thr Phe Asp Tyr Leu Gln Ser Val Leu Asp Asp
1 5 10 15

Phe Tyr Thr Ala Thr Glu Gly Gln Tyr Gln Gln Gln Pro
20 25



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Pro Glu Asp Arg Pro Thr Phe Asp Tyr Leu Arg Ser Val Leu Glu Asp
1 5 10 15

Phe Phe Thr Ala Thr Glu Gly Gln Tyr Gln Pro Gln Pro
20 25



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Pro Xaa Xaa Xaa Xaa Pro 
1 5 



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Pro Asp Phe Arg Leu Pro Glu Ile Ala Ile Pro Glu Phe Ile Ile Pro
1 5 10 15

Thr Leu Asn Leu Asn Asp Phe Gln Val Pro Asp Leu His Ile Pro Glu
20 25 30

Phe Gln Leu Pro His Ile Ser His
35 40



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Pro Gln Asn Ala Lys Leu Lys Ile Lys Arg Pro Val Lys Val Gln Pro
1 5 10 15

Ile Ala Arg Val Trp Tyr
20



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Pro Asp Phe Arg Leu Pro Glu Ile Ala Ile Pro Glu Phe Ile Ile Pro
1 5 10 15

Thr Leu Asn Leu Asn Asp
20



(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Asn Asp Phe Gln Val Pro Asp Leu His Ile Pro Glu Phe Gln Leu Pro
1 5 10 15

His Ile Ser His Thr Ile
20



(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

Pro Ser Leu Glu Leu Pro Val Leu His Val Pro Arg Asn Leu Lys Leu
1 5 10 15

Ser Leu Pro His Phe Lys
20



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 379 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Met Ala Ser Gly Arg Ala Arg Cys Thr Arg Lys Leu Arg Asn Trp Val
1 5 10 15

Val Glu Gln Val Glu Ser Gly Gln Phe Pro Gly Val Cys Trp Asp Asp
20 25 30

Thr Ala Lys Thr Met Phe Arg Ile Pro Trp Lys His Ala Gly Lys Gln
35 40 45

Asp Phe Arg Glu Ser Gln Asp Ala Ala Phe Phe Lys Ala Trp Ala Ile
50 55 60

Phe Lys Gly Lys Tyr Lys Glu Gly Asp Lys Glu Val Pro Glu Arg Gly
65 70 75 80

Arg Met Asp Val Ala Glu Pro Tyr Lys Val Tyr Gln Leu Leu Pro Pro
85 90 95

Gly Ile Val Ser Gly Gln Pro Gly Thr Gln Lys Val Pro Ser Lys Arg
100 105 110

Gln His Ser Ser Val Ser Ser Glu Arg Lys Glu Glu Asp Ala Met Gln
115 120 125

Asn Cys Thr Leu Ser Pro Ser Val Leu Gln Asp Ser Leu Asn Asn Glu
130 135 140

Glu Gly Ala Ser Gly Gly Ala Val His Ser Asp Ile Gly Ser Ser Ser
145 150 155 160

Ser Ser Ser Ser Pro Glu Pro Gln Glu Val Thr Asp Thr Thr Glu Ala
165 170 175

Pro Phe Gln Gly Asp Gln Arg Ser Leu Glu Phe Leu Leu Pro Pro Glu
180 185 190

Pro Asp Tyr Ser Leu Leu Leu Thr Phe Ile Tyr Asn Gly Arg Val Val
195 200 205

Gly Glu Ala Gln Val Gln Ser Leu Asp Cys Arg Leu Val Ala Glu Pro
210 215 220

Ser Gly Ser Glu Ser Ser Met Glu Gln Val Leu Phe Pro Lys Pro Gly
225 230 235 240

Pro Glu Pro Thr Gln Arg Leu Leu Ser Gln Leu Glu Arg Gly Ile Leu
245 250 255

Val Ala Ser Asn Pro Arg Gly Leu Phe Val Gln Arg Leu Cys Pro Ile
260 265 270

Pro Ile Ser Trp Asn Ala Pro Gln Ala Pro Pro Gly Pro Gly Pro His
275 280 285

Leu Leu Pro Ser Asn Glu Cys Val Glu Leu Phe Arg Thr Ala Tyr Phe
290 295 300

Cys Arg Asp Leu Val Arg Tyr Phe Gln Gly Leu Gly Pro Pro Pro Lys
305 310 315 320

Phe Gln Val Thr Leu Asn Phe Trp Glu Glu Ser His Gly Ser Ser His
325 330 335

Thr Pro Gln Asn Leu Ile Thr Val Lys Met Glu Gln Ala Phe Ala Arg
340 345 350

Tyr Leu Lys Met Glu Gln Ala Phe Ala Arg Tyr Leu Leu Glu Gln Thr
355 360 365

Pro Glu Gln Gln Ala Ala Ile Leu Ser Leu Val
370 375



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 383 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Val Ser Leu Val Cys Pro Lys Asp Ala Thr Arg Phe Lys His Leu Arg
1 5 10 15

Lys Tyr Thr Tyr Asn Tyr Glu Ala Glu Ser Ser Ser Gly Val Pro Gly
20 25 30

Thr Ala Asp Ser Arg Ser Ala Thr Arg Ile Asn Cys Lys Val Glu Leu
35 40 45

Glu Val Pro Gln Leu Cys Ser Phe Ile Leu Lys Thr Ser Gln Cys Thr
50 55 60

Leu Lys Glu Val Tyr Gly Phe Asn Pro Glu Gly Lys Ala Leu Leu Lys
65 70 75 80

Lys Thr Lys Asn Ser Glu Glu Phe Ala Ala Ala Met Ser Arg Tyr Glu
85 90 95

Leu Lys Leu Ala Ile Pro Glu Gly Lys Gln Val Phe Leu Tyr Pro Glu
100 105 110

Lys Asp Glu Pro Thr Tyr Ile Leu Asn Ile Lys Arg Gly Ile Ile Ser
115 120 125

Ala Leu Leu Val Pro Pro Glu Thr Glu Glu Ala Lys Gln Val Leu Phe
130 135 140

Leu Asp Thr Val Tyr Gly Asn Cys Ser Thr His Phe Thr Val Lys Thr
145 150 155 160

Arg Lys Gly Asn Val Ala Thr Glu Ile Ser Thr Glu Arg Asp Leu Gly
165 170 175

Gln Cys Asp Arg Phe Lys Pro Ile Arg Thr Gly Ile Ser Pro Leu Ala
180 185 190

Leu Ile Lys Gly Met Thr Arg Pro Leu Ser Thr Leu Ile Ser Ser Ser
195 200 205

Gln Ser Cys Gln Tyr Thr Leu Asp Ala Lys Arg Lys His Val Ala Glu
210 215 220

Ala Ile Cys Lys Glu Gln His Leu Phe Leu Pro Phe Ser Tyr Lys Asn
225 230 235 240

Lys Tyr Gly Met Val Ala Gln Val Thr Gln Thr Leu Lys Leu Glu Asp
245 250 255

Thr Pro Lys Ile Asn Ser Arg Phe Phe Gly Glu Gly Thr Lys Lys Met
260 265 270

Gly Leu Ala Phe Glu Ser Thr Lys Ser Thr Ser Pro Pro Lys Gln Ala
275 280 285

Glu Ala Val Leu Lys Thr Leu Gln Glu Leu Lys Lys Leu Thr Ile Ser
290 295 300

Glu Gln Asn Ile Gln Arg Ala Asn Leu Phe Asn Lys Leu Val Thr Glu
305 310 315 320

Leu Arg Gly Leu Ser Asp Glu Ala Val Thr Ser Leu Leu Pro Gln Leu
325 330 335

Ile Glu Val Ser Ser Pro Ile Thr Leu Gln Ala Leu Val Gln Cys Gly
340 345 350

Gln Pro Gln Cys Ser Thr His Ile Leu Lys Arg Val His Ala Asn Pro
355 360 365

Leu Leu Ile Asp Val Val Thr Tyr Leu Val Ala Leu Ile Pro Glu
370 375 380



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 394 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

Phe Gly Leu Ser Asn Lys Ile Asn Ser Lys His Leu Arg Val Asn Gln
1 5 10 15

Asn Leu Val Tyr Glu Ser Gly Ser Leu Asn Phe Ser Lys Leu Glu Ile
20 25 30

Gln Ser Gln Val Asp Ser Gln His Val Gly His Ser Val Leu Thr Ala
35 40 45

Lys Gly Met Ala Leu Phe Gly Glu Gly Lys Ala Glu Phe Thr Gly Arg
50 55 60

His Asp Ala His Leu Asn Gly Lys Val Ile Gly Thr Leu Lys Asn Ser
65 70 75 80

Leu Phe Phe Ser Ala Gln Pro Phe Glu Ile Thr Ala Ser Thr Asn Asn
85 90 95

Glu Gly Asn Leu Lys Val Arg Phe Pro Leu Arg Leu Thr Gly Lys Ile
100 105 110

Asp Phe Leu Asn Asn Tyr Ala Leu Phe Leu Ser Pro Ser Ala Gln Gln
115 120 125

Ala Ser Trp Gln Val Ser Ala Arg Phe Asn Gln Tyr Lys Tyr Asn Gln
130 135 140

Asn Phe Ser Ala Gly Asn Asn Glu Asn Ile Met Glu Ala His Val Gly
145 150 155 160

Ile Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn Ile Pro Leu Thr Ile
165 170 175

Pro Glu Met Arg Leu Pro Tyr Thr Ile Ile Thr Thr Pro Pro Leu Lys
180 185 190

Asp Phe Ser Leu Trp Glu Lys Thr Gly Leu Lys Glu Phe Leu Lys Thr
195 200 205

Thr Lys Gln Ser Phe Asp Leu Ser Val Lys Ala Gln Tyr Lys Lys Asn
210 215 220

Lys His Arg His Ser Ile Asn Pro Leu Ala Val Leu Cys Glu Phe Ile
225 230 235 240

Ser Gln Ser Ile Lys Ser Phe Asp Arg His Phe Glu Lys Asn Arg Asn
245 250 255

Asn Ala Leu Asp Phe Val Thr Lys Ser Tyr Asn Glu Thr Lys Ile Lys
260 265 270

Phe Asp Lys Tyr Lys Ala Glu Lys Ser His Asp Glu Leu Pro Arg Thr
275 280 285

Phe Gln Ile Pro Gly Tyr Thr Val Pro Val Val Asn Val Glu Val Ser
290 295 300

Pro Phe Thr Ile Glu Met Ser Ala Phe Gly Tyr Val Phe Pro Lys Ala
305 310 315 320

Val Ser Met Pro Ser Phe Ser Ile Leu Gly Ser Asp Val Arg Val Pro
325 330 335

Ser Tyr Thr Leu Ile Leu Pro Ser Leu Glu Leu Pro Val Leu His Val
340 345 350

Pro Arg Asn Leu Lys Leu Ser Leu Pro His Phe Lys Glu Leu Cys Thr
355 360 365

Ile Ser His Ile Phe Ile Pro Ala Met Gly Asn Ile Thr Tyr Asp Phe
370 375 380

Ser Phe Lys Ser Ser Val Ile Thr Leu Asn
385 390



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

Met Ala Ser Gly Arg Ala Arg Cys Thr Arg Lys Leu Arg Asn Trp Val
1 5 10 15

Val Glu Gln Val Glu Ser Gly Gln Phe Pro Gly Val Cys Trp Asp Asp
20 25 30

Thr Ala Lys Thr Met Phe Arg Ile Pro Trp Lys His Ala Gly Lys Gln
35 40 45

Asp Phe Arg
50



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Pro Lys Asp Ala Thr Arg Phe Lys His Leu Arg Lys Tyr Thr Tyr Asn
1 5 10 15

Tyr Glu Ala Glu Ser Ser Ser Gly Val Pro Gly Thr Ala Asp Ser Arg
20 25 30

Ser Ala Thr Arg Ile Asn Cys Lys Val Glu Leu Glu Val Leu Pro Gln
35 40 45



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Pro Glu Gly Lys Ala Leu Leu Lys Lys Thr Lys Asn Ser Glu Glu Phe
1 5 10 15

Ala Ala Ala Met Ser Arg Tyr Glu Leu Lys Leu Ala Ile Pro Glu Gly
20 25 30

Lys Gln Val Phe Leu
35



(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Cys Ser Thr His Phe Thr Val Lys Thr Arg Lys Gly Asn Val Ala Thr
1 5 10 15

Glu Ile Ser Thr Glu Arg Asp Leu Gly Gln Cys Asp Arg Phe Lys Pro
20 25 30

Ile Arg Thr Gly Ile Ser
35



(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Cys Ser Thr His Ile Leu Gln Trp Leu Lys Arg Val His Ala Asn Pro
1 5 10 15

Leu Leu Ile Asp Val Val Thr Tyr Leu Val Ala Leu Ile Pro Glu Pro
20 25 30

Ser Ala Gln Gln Leu Arg Glu Ile Phe Asn Met Ala Arg Asp Gln Arg
35 40 45

Ser Arg Ala 
50 



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

His Leu Ser Cys Asp Thr Lys Glu Glu Arg Lys Ile Lys Gly Val Ile
1 5 10 15

Ser Ile Pro Arg Leu Gln Ala Glu Ala Arg Ser Glu Ile Leu Ala His
20 25 30

Trp Ser Pro Ala Lys Leu
35



(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Ser Val His Leu Asp Ser Lys Lys Lys Gln His Leu Phe Val Lys Glu
1 5 10 15

Val Lys Ile Asp Gly Gln Phe Arg Val Ser Ser Phe Tyr Ala Lys Gly
20 25 30

Thr Tyr Gly Leu Ser Cys Gln Arg Asp Pro Asn Thr Gly Arg Leu
35 40 45



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg Ala Ala Leu
1 5 10 15

Gly Lys Leu Pro Gln Gln Ala Asn Asp Tyr Leu Ser Phe Asn Trp Glu
20 25 30

Arg Gln Val Ser His Ala Lys Glu
35 40



(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Lys Leu Thr Ala Leu Thr Lys Lys Tyr Arg Ile Thr Glu Asn Asp Ile
1 5 10 15


Gln Ile Ala Leu Asp Asp Ala Lys Ile Asn Phe Asn Glu Lys Leu Ser
20 25 30

Gln Leu Gln Thr Tyr Met Ile Gln
35 40



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

Glu Arg Ile Asn Asp Val Leu Glu His Val Lys His Phe Val Ile Asn
1 5 10 15

Leu Ile Gly Asp Phe Glu Val Ala Glu Lys Ile Asn Ala Phe Arg Ala
20 25 30

Lys Val His Glu Leu Ile Glu Arg Tyr Glu Val Asp Gln Gln Ile Gln
35 40 45

Val Leu 
50 



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Asn Lys Phe Leu Asp Met Leu Ile Lys Lys Leu Lys Ser Phe Asp Tyr
1 5 10 15

His Gln Phe Val Asp Glu Thr Asn Asp Lys Ile Arg Glu Val Thr Gln
20 25 30

Arg Leu Asn Gly Glu Ile Gln Ala Leu Glu Leu Pro Gln Lys Ala Glu
35 40 45



Ala Leu
50



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Ser Asn Lys Ile Asn Ser Lys His Leu Arg Val Asn Gln Asn Leu Val
1 5 10 15 

Tyr Glu Ser Gly Ser Leu Asn 

20 



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Phe Ser Lys Leu Glu Ile Gln Ser Gln Val Asp Ser Gln His Val Gly
1 5 10 15

His Ser Val Leu Thr Ala Lys Gly Met Ala Leu Phe Gly Glu Gly Gly 

20 25 30 

Lys Ala Glu Phe Thr Gly Arg His Asp Ala His Leu Asn Gly Lys 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Val Lys Ala Gln Tyr Lys Lys Asn Lys His Arg His Ser Ile Thr Asn
1 5 10 15

Pro Leu Ala Val Leu Cys Glu Phe Ile Ser Gln Ser Ile Lys Ser Phe
20 25 30

Asp Arg His Phe Glu Lys Asn Arg Asn Asn Ala Leu Asp Phe Val Thr
35 40 45

Lys Ser
50



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Lys Leu Glu Gly Thr Thr Arg Leu Thr Arg Lys Arg Gly Leu Lys Leu 
1 5 10 15

Ala Thr Ala Leu Ser Leu Ser Asn Lys Phe Val Glu Gly Ser His Asn 

20 25 30 

Ser Thr Val Ser Leu Thr Thr Lys Asn Met Glu Val Ser Val Ala Lys 
35 40 45 

Thr Thr Lys 
50 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Lys Leu Asp Val Thr Thr Ser Ile Gly Arg Arg Gln His Leu Arg Val
1 5 10 15

Ser Thr Ala Phe Val Tyr Thr Lys Asn Pro Asn Gly Tyr Ser Phe Ser
20 25 30

Ile Pro Val Lys Val Leu Ala Asp Lys Phe Ile Thr Pro Gly Leu Lys
35 40 45

Leu Asn Asp
50



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Phe Arg Glu Ile Gln Ile Tyr Lys Lys Leu Arg Thr Ser Ser Phe Ala
1 5 10 15

Leu Asn Leu Pro Thr Leu Pro Glu Val Lys Phe Pro Glu Val Asp Val
20 25 30

Leu Thr Lys Tyr Ser Gln Pro Glu Asp Ser Leu Ile Pro Phe Phe Glu
35 40 45

Ile



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

Leu His Leu Arg Tyr Gln Lys Asp Lys Lys Gly Ile Ser Thr Ser Ala
1 5 10 15

Ala Ser Pro Ala Val Gly Thr Val Gly Met Asp Met Asp Glu Asp Asp
20 25 30

Asp Phe Ser Lys Trp Asn Phe Tyr Tyr Ser Pro Gln Ser Ser Pro Asp
35 40 45



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Leu Arg Glu Val Ser Ser Lys Leu Arg Arg Asn Leu Gln Asn Asn Ala
1 5 10 15

Glu Trp Val Tyr Gln Gly Ala Ile Arg Gln Ile Asp Asp Ile Asp Val
20 25 30

Arg Phe Gln Lys Ala Ala Ser Gly Thr Thr Gly Thr Tyr Gln Glu Trp
35 40 45




(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

Arg Val Thr Gln Lys Phe His Met Lys Val Lys His Leu Ile Asp Ser
1 5 10 15

Leu Ile Asp Phe Leu Asn Phe Pro Arg Phe Gln Phe Pro Gly Lys Pro
20 25 30

Gly Ile Tyr Thr Arg Glu Glu Leu Cys Thr Met Phe Ile Arg Glu Val
35 40 45

Gly Thr 
50 



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Trp Lys His Ala Gly Lys Gln Asp Phe Arg Glu Ser Gln Asp Ala Ala
1 5 10 15

Phe Phe Lys Ala Trp Ala Ile Phe Lys Gly Lys Tyr Lys Glu Gly Asp

20 25 30 

Lys Glu Val Pro Glu Arg Gly Arg Met Asp Val Ala Glu Pro Tyr Lys 
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

Glu His Val Lys His Phe Val Ile Asn Leu Ile Gly Asp Phe Glu Val
1 5 10 15

Ala Glu Lys Ile Asn Ala Phe Arg Ala Lys Val His Glu Leu Ile Glu
20 25 30

Arg Tyr Glu Val Asp Gln Gln Ile Gln Val Leu Met Asp Lys Leu Val
35 40 45



(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

Val Arg Lys Tyr Arg Ala Ala Leu Gly Lys Leu Pro Gln Gln Ala Asn
1 5 10 15

Asp Tyr Leu Asn Ser Phe Asn Trp Glu Arg Gln Val Ser His Ala Lys
20 25 30

Glu Lys Leu Thr Ala Leu Thr Lys Lys Tyr Arg Ile Thr Glu Asn Asp
35 40 45

Ile Gln Ile Ala
50



(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

Tyr Ile Lys Asp Ser Tyr Asp Leu His Asp Leu Lys Ile Ala Ile Ala
1 5 10 15

Asn Ile Ile Asp Glu Ile Ile Glu Lys Leu Lys Ser Leu Asp Glu His
20 25 30

Tyr His Ile Arg Val Asn Leu Val Lys Thr Ile His Asp Leu His Leu
35 40 45

Phe Ile Glu Asn Ile Asp Phe Asn Lys
50 55



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

Lys Ile Thr Leu Ile Ile Asn Trp Leu Gln Glu Ala Leu Ser Ser Ala
1 5 10 15

Ser Leu Ala His Met Lys Ala Lys Phe Arg Glu Thr Leu Glu Asp Thr 

20 25 30 

Arg 



(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Thr Asp His Phe Ser Leu Arg Ala Arg Tyr His Met Lys Ala Asp Ser 
1 5 10 15

Val Val Asp Leu Ser Tyr Asn Val Gln Gly Ser Gly Glu Thr Thr Tyr

20 25 30 



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Lys Leu Thr Thr Asn Gly Arg Phe Arg Glu His Asn Ala Lys Phe Ser 
1 5 10 15

Leu Asp Gly Lys
20



(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

Asp Thr Lys Tyr Gln Ile Arg Ile Gln Ile Gln Glu Lys Leu Gln Gln
1 5 10 15

Leu Lys Arg His Ile Gln Asn Ile Asp Ile Gln His Leu Ala Gly Lys
20 25 30

Leu Lys Gln His Ile Glu Ala Ile Asp Val Arg Val Leu Leu Asp Gln
35 40 45 

Leu Gly Thr Thr 
50 



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Phe His Asp Phe Pro Asp Leu Gly Gln Glu Val Ala Leu Asn Ala Asn
1 5 10 15

Thr Lys Asn Gln Lys Ile Arg Trp Lys Asn Glu Val Arg Ile His Ser
20 25 30

Gly Ser Phe Gln Ser Gln Val Glu Leu Ser Asn Asp Gln
35 40 45 



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

Lys Asp Asn Val Phe Asp Gly Leu Val Arg Val Thr Gln Lys Phe His
1 5 10 15



Met Lys Val Lys His Leu Ile Asp Ser Leu Ile Asp Phe Leu Asn Phe

20 25 30 

Pro Arg 



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

His Arg Asn Ile Gln Glu Tyr Leu Ser Ile Leu Thr Asp Pro Asp Gly
1 5 10 15

Lys Gly Lys Glu Lys Ile Ala Glu Leu Ser Ala Thr Ala Gln Glu Ile
20 25 30

Ile Lys Ser
35 



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 211 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Glu Phe Thr Ile Val Ala Phe Val Lys Tyr Asp Lys Asn Gln Asp Val
1 5 10 15

His Ser Ile Asn Leu Pro Phe Phe Glu Thr Leu Gln Glu Tyr Phe Glu
20 25 30

Arg Asn Arg Gln Thr Ile Ile Val Val Leu Glu Asn Val Gln Arg Lys
35 40 45

Leu Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg Ala Ala
50 55 60

Leu Gly Lys Leu Pro Gln Gln Ala Asn Asp Tyr Leu Asn Ser Phe Asn
65 70 75 80

Trp Glu Arg Gln Val Ser His Ala Lys Glu Lys Leu Thr Ala Leu Thr
85 90 95

Lys Lys Tyr Arg Ile Thr Glu Asn Asp Ile Gln Ile Ala Leu Asp Asp
100 105 110

Ala Lys Ile Asn Phe Asn Glu Lys Leu Ser Gln Leu Gln Thr Tyr Met
115 120 125

Ile Gln Phe Asp Gln Tyr Ile Lys Asp Ser Tyr Asp Leu His Asp Leu
130 135 140

Lys Ile Ala Ile Ala Asn Ile Ile Asp Glu Ile Ile Glu Lys Leu Lys
145 150 155 160

Ser Leu Asp Glu His Tyr His Ile Arg Val Ile Leu Val Lys Thr Ile
165 170 175

His Asp Leu His Leu Phe Ile Glu Asn Ile Asp Phe Asn Lys Ser Gly
180 185 190

Ser Ser Thr Ala Ser Trp Ile Gln Asn Val Asp Thr Lys Tyr Gln Ile
195 200 205

Arg Ile Gln
210



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

Gly Pro Leu Pro Thr Leu Val Ser Gly Gly Thr Ile Leu Ala Thr Val
1 5 10 15

Pro Leu Val Val Asp Ala Glu Lys Leu Pro Ile Asn Arg Leu Ala Ala
20 25 30

Gly Ser Lys Ala Pro Ala Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr
35 40 45

Ala His Asn Ala Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys
50 55 60

Ile Ile Glu Leu Lys Asp Leu Val Val Gly Thr Glu Ala Lys Leu Asn
65 70 75 80

Lys Ser Ala Val Leu Arg Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln
85 90 95

His Ser Asn Gln Lys Leu Lys Gln Glu Asn Leu Ser Leu Arg Thr Ala
100 105 110

Val His Lys Ser Lys Ser Leu Lys Asp Leu Val Ser Ala Cys Gly Ser
115 120 125

Gly Gly Asn Thr Asp Val Leu Met Glu Gly Val Lys Thr Glu Val Glu
130 135 140

Asp Thr Leu Thr Pro Pro Pro Ser Asp Ala Gly Ser Pro Phe Gln Ser
145 150 155 160

Ser Pro Leu Ser Leu Gly Ser Arg Gly Ser Gly Ser Gly Gly
165 170



(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 172 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Gln Val Pro Thr Leu Val Gly Ser Ser Gly Thr Ile Leu Thr Thr Met
1 5 10 15

Pro Val Met Met Gly Gln Glu Lys Val Pro Ile Lys Gln Val Pro Gly
20 25 30

Gly Val Lys Gln Leu Glu Pro Pro Lys Glu Gly Glu Arg Arg Thr Thr
35 40 45

His Asn Ile Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile
50 55 60

Ile Glu Leu Lys Asp Leu Val Met Gly Thr Asp Ala Lys Met His Lys
65 70 75 80

Ser Gly Val Leu Arg Lys Ala Ile Asp Tyr Ile Lys Tyr Leu Gln Gln
85 90 95

Val Asn His Lys Leu Arg Gln Glu Asn Met Val Leu Lys Leu Ala Asn
100 105 110

Gln Lys Asn Lys Leu Leu Lys Gly Ile Asp Leu Gly Ser Leu Val Asp
115 120 125

Asn Glu Val Asp Leu Lys Ile Glu Asp Phe Asn Gln Asn Val Leu Leu
130 135 140

Met Ser Pro Pro Ala Ser Asp Ser Gly Ser Gln Ala Gly Phe Ser Pro
145 150 155 160

Tyr Ser Ile Asp Ser Glu Pro Gly Ser Pro Leu Leu
165 170



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

Gly Pro Leu Gln Thr Leu Val Ser Gly Gly Thr Ile Leu Ala Thr Val
1 5 10 15

Pro Leu Val Val Asp Thr Asp Lys Leu Pro Ile His Arg Leu Ala Ala
20 25 30

Gly Gly Lys Ala Leu Gly Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr
35 40 45

Ala His Asn Ala Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys
50 55 60

Ile Val Glu Leu Lys Asp Leu Val Val Gly Thr Glu Ala Lys Leu Asn
65 70 75 80

Lys Ser Ala Val Leu Arg Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln
85 90 95

His Ser Asn Gln Lys Leu Lys Gln Glu Asn Leu Thr Leu Arg Ser Ala
100 105 110

His Lys Ser Lys Ser Leu Lys Asp Leu Val Ser Ala Cys Gly Ser Gly
115 120 125

Gly Gly Thr Asp Val Ser Met Glu Gly Met Lys Pro Glu Val Val Glu
130 135 140

Thr Leu Thr Pro Pro Pro Ser Asp Ala Gly Ser Pro Ser Gln Ser Ser
145 150 155 160

Pro Leu Ser Leu Gly Ser Arg Gly Ser Ser Ser Gly Gly 

165 170 



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 243 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

Asp Glu Pro Pro Gln Ser Pro Trp Asp Arg Val Lys Asp Leu Ala Thr
1 5 10 15

Val Tyr Val Asp Val Leu Lys Asp Ser Gly Arg Asp Tyr Val Ser Gln
20 25 30

Phe Glu Gly Ser Ala Leu Gly Lys Gln Leu Asn Leu Lys Leu Leu Asp
35 40 45

Asn Trp Asp Ser Val Thr Ser Thr Phe Ser Lys Leu Arg Glu Gln Leu
50 55 60

Gly Pro Val Thr Gln Glu Phe Trp Asp Asn Leu Glu Lys Glu Thr Glu
65 70 75 80

Gly Leu Arg Gln Glu Met Ser Lys Asp Leu Glu Glu Val Lys Ala Lys
85 90 95

Val Gln Pro Tyr Leu Asp Asp Phe Gln Lys Lys Trp Gln Glu Glu Met
100 105 110

Glu Leu Tyr Arg Gln Lys Val Glu Pro Leu Arg Ala Glu Leu Gln Glu
115 120 125

Gly Ala Arg Gln Lys Leu His Glu Leu Gln Glu Lys Leu Ser Pro Leu
130 135 140

Gly Glu Glu Met Arg Asp Arg Ala Arg Ala His Val Asp Ala Leu Arg
145 150 155 160

Thr His Leu Ala Pro Tyr Ser Asp Glu Leu Arg Gln Arg Leu Ala Ala
165 170 175

Arg Leu Glu Ala Leu Lys Glu Asn Gly Gly Ala Arg Leu Ala Glu Tyr
180 185 190

His Ala Lys Ala Thr Glu His Leu Ser Thr Leu Ser Glu Lys Ala Lys
195 200 205

Pro Ala Leu Glu Asp Leu Arg Gln Gly Leu Leu Pro Val Leu Glu Ser
210 215 220

Phe Lys Val Ser Phe Leu Ser Ala Leu Glu Glu Tyr Thr Lys Lys Leu
225 230 235 240

Asn Thr Gln



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 268 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

Gln Gln Val Pro Val Leu Leu Gln Pro His Phe Ile Lys Ala Asp Ser
1 5 10 15

Leu Leu Leu Thr Ala Met Lys Thr Asp Gly Ala Thr Val Lys Ala Ala
20 25 30

Gly Leu Ser Pro Leu Val Ser Gly Thr Thr Val Gln Thr Gly Pro Leu
35 40 45

Pro Thr Leu Val Ser Gly Gly Thr Ile Leu Ala Thr Val Pro Leu Val
50 55 60

Val Asp Ala Glu Lys Leu Pro Ile Asn Arg Leu Ala Ala Gly Ser Lys
65 70 75 80

Ala Pro Ala Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn
85 90 95

Ala Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu
100 105 110

Leu Lys Asp Leu Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser Ala
115 120 125

Val Leu Arg Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln His Ser Asn
130 135 140

Gln Lys Leu Lys Gln Glu Asn Leu Ser Leu Arg Thr Ala Val His Lys
145 150 155 160

Ser Lys Ser Leu Lys Asp Leu Val Ser Ala Cys Gly Ser Gly Gly Asn
165 170 175

Thr Asp Val Leu Met Glu Gly Val Lys Thr Glu Val Glu Asp Thr Leu
180 185 190

Thr Pro Pro Pro Ser Asp Ala Gly Ser Pro Phe Gln Ser Ser Pro Leu
195 200 205

Ser Leu Gly Ser Arg Gly Ser Gly Ser Gly Gly Ser Gly Ser Asp Ser
210 215 220

Glu Pro Asp Ser Pro Val Phe Glu Asp Ser Lys Ala Lys Pro Glu Gln
225 230 235 240

Arg Pro Ser Leu His Ser Arg Gly Met Leu Asp Arg Ser Arg Leu Ala
245 250 255

Leu Cys Thr Leu Val Phe Leu Cys Leu Ser Cys Asn
260 265



(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 77 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Gln Ala Lys Glu Pro Cys Val Glu Ser Leu Val Ser Gln Tyr Phe Gln
1 5 10 15

Thr Val Thr Asp Tyr Gly Lys Asp Leu Met Glu Lys Val Lys Ser Pro
20 25 30

Glu Leu Gln Ala Glu Ala Lys Ser Tyr Phe Glu Lys Ser Lys Glu Gln
35 40 45

Leu Thr Pro Leu Ile Lys Lys Ala Gly Thr Glu Leu Val Asn Phe Leu
50 55 60

Ser Tyr Phe Val Glu Leu Gly Thr Gln Pro Ala Thr Gln
65 70 75 



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 71 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Glu Ala Lys Leu Asn Lys Ser Ala Val Leu Arg Lys Ala Ile Asp Tyr
1 5 10 15

Ile Arg Phe Leu Gln His Ser Asn Gln Lys Leu Lys Gln Glu Asn Leu

20 25 30 

Ser Leu Arg Thr Ala Val His Lys Ser Lys Ser Leu Lys Asp Leu Val 
35 40 45 

Ser Ala Cys Gly Ser Gly Gly Asn Thr Asp Val Leu Met Glu Gly Val 
50 55 60 

Lys Thr Glu Val Glu Asp Thr 
65 70 



(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 397 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:

Gln Lys Ser Glu Leu Thr Gln Gln Leu Asn Ala Leu Phe Gln Asp Lys
1 5 10 15

Leu Gly Glu Val Asn Thr Tyr Ala Gly Asp Leu Gln Lys Lys Leu Val
20 25 30

Pro Phe Ala Thr Glu Leu His Glu Arg Leu Ala Lys Asp Ser Glu Lys
35 40 45

Leu Lys Glu Glu Ile Gly Lys Glu Leu Glu Glu Leu Arg Ala Arg Leu
50 55 60

Leu Pro His Ala Asn Glu Val Ser Gln Lys Ile Gly Asp Asn Leu Arg
65 70 75 80

Glu Leu Gln Gln Arg Leu Glu Pro Tyr Ala Asp Gln Leu Arg Thr Gln
85 90 95

Val Asn Thr Gln Ala Glu Gln Leu Arg Arg Gln Leu Asp Pro Leu Ala
100 105 110

Gln Arg Met Glu Arg Val Leu Arg Glu Asn Ala Asp Ser Leu Gln Ala
115 120 125

Ser Leu Arg Pro His Ala Asp Glu Leu Lys Ala Lys Ile Asp Gln Asn
130 135 140

Val Glu Glu Leu Lys Gly Arg Leu Thr Pro Tyr Ala Asp Glu Phe Lys
145 150 155 160

Val Lys Ile Asp Gln Thr Val Glu Glu Leu Arg Arg Ser Leu Ala Pro
165 170 175

Tyr Ala Gln Asp Thr Gln Glu Lys Leu Asn His Gln Leu Glu Gly Leu
180 185 190

Thr Phe Gln Met Lys Lys Asn Ala Glu Glu Leu Lys Ala Arg Ile Ser
195 200 205

Ala Ser Ala Glu Ile Asp Gln Thr Val Glu Glu Leu Arg Arg Ser Leu
210 215 220

Ala Pro Tyr Ala Gln Asp Thr Gln Glu Lys Leu Asn His Gln Leu Glu
225 230 235 240

Gly Leu Thr Phe Gln Met Lys Lys Asn Ala Glu Glu Leu Lys Ala Arg
245 250 255

Ile Ser Ala Ser Ala Glu Glu Leu Arg Gln Arg Leu Ala Pro Leu Ala
260 265 270

Glu Asp Val Arg Gly Asn Leu Lys Gly Asn Thr Glu Gly Leu Gln Lys
275 280 285

Ser Leu Ala Glu Leu Gly Gly His Leu Asp Gln Gln Val Glu Glu Phe
290 295 300

Arg Arg Arg Val Glu Pro Tyr Gly Glu Asn Phe Asn Lys Ala Leu Val
305 310 315 320

Gln Gln Met Glu Gln Leu Arg Gln Lys Leu Gly Pro His Ala Gly Asp
325 330 335

Val Glu Gly His Leu Ser Phe Leu Glu Lys Asp Leu Arg Asp Lys Val
340 345 350

Asn Ser Phe Phe Ser Thr Phe Lys Glu Lys Glu Ser Gln Asp Lys Thr
355 360 365

Leu Ser Leu Pro Glu Leu Glu Gln Gln Gln Glu Gln Gln Gln Glu Gln
370 375 380

Gln Gln Glu Gln Val Gln Met Leu Ala Pro Leu Glu Ser
385 390 395



(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 422 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Glu Lys Leu Pro Ile Asn Arg Leu Ala Ala Gly Ser Lys Ala Pro Ala
1 5 10 15

Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn Ala Ile Glu
20 25 30

Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu Leu Lys Asp
35 40 45

Leu Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser Ala Val Leu Arg
50 55 60

Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln His Ser Asn Gln Lys Leu
65 70 75 80

Lys Gln Glu Asn Leu Ser Leu Arg Thr Ala Val His Lys Ser Lys Ser
85 90 95

Leu Lys Asp Leu Val Ser Ala Cys Gly Ser Gly Gly Asn Thr Asp Val
100 105 110

Leu Met Glu Gly Val Lys Thr Glu Val Glu Asp Thr Leu Thr Pro Pro
115 120 125

Pro Arg Asp Ala Gly Ser Pro Phe Gln Ser Ser Pro Leu Ser Leu Gly
130 135 140

Ser Arg Gly Ser Gly Ser Gly Gly Ser Gly Ser Asp Ser Glu Pro Asp
145 150 155 160

Ser Pro Val Phe Glu Asp Ser Lys Ala Lys Pro Glu Gln Arg Pro Ser
165 170 175

Leu His Ser Arg Gly Met Leu Asp Arg Ser Arg Leu Ala Leu Cys Thr
180 185 190

Leu Val Phe Leu Cys Leu Ser Cys Asn Pro Leu Ala Ser Leu Leu Gly
195 200 205

Ala Arg Gly Leu Pro Ser Pro Ser Asp Thr Thr Ser Val Tyr His Ser
210 215 220

Pro Gly Arg Asn Val Leu Gly Thr Glu Ser Arg Asp Gly Pro Gly Trp
225 230 235 240

Ala Gln Ala Val Gln Leu Phe Leu Cys Asp Leu Leu Leu Val Val Arg
245 250 255

Thr Ser Leu Trp Arg Gln Gln Gln Pro Pro Ala Pro Ala Pro Ala Ala
260 265 270

Gln Gly Ala Ser Ser Arg Pro Gln Ala Ser Ala Leu Glu Ile Arg Gly
275 280 285

Phe Gln Arg Asp Leu Ser Ser Leu Arg Arg Leu Ala Gln Ser Phe Arg
290 295 300

Pro Ala Met Arg Arg Val Phe Leu His Glu Ala Thr Ala Arg Leu Met
305 310 315 320

Ala Gly Ala Ser Pro Thr Arg Thr His Gln Leu Leu Asp Arg Ser Leu
325 330 335

Arg Arg Arg Ala Gly Pro Gly Gly Lys Gly Gly Ala Val Ala Glu Leu
340 345 350

Glu Pro Arg Pro Thr Arg Arg Glu His Ala Glu Ala Leu Leu Leu Ala
355 360 365

Ser Cys Tyr Leu Pro Pro Gly Phe Leu Ser Ala Pro Gly Gln Arg Val
370 375 380

Gly Met Leu Ala Glu Ala Ala Arg Thr Leu Glu Lys Leu Gly Asp Arg
385 390 395 400

Arg Leu Leu His Asp Cys Gln Gln Met Leu Met Arg Leu Gly Gly Gly
405 410 415

Thr Thr Val Thr Ser Ser
420



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

Glu Lys Met Ser Leu Arg Asn Arg Leu Ser Lys Ser Arg Glu Asn Pro
1 5 10 15

Glu Glu Asp Glu Asp Gln Arg Asn Pro Ala Lys Glu Ser Leu Glu Thr
20 25 30

Pro Ser Asn Gly Arg Ile Asp Ile Lys Gln Leu Ile Ala Lys Lys Ile
35 40 45

Lys Leu Thr Ala Asn Gly Arg Ile Asp Ile Lys Gln Leu Ile Ala Lys
50 55 60

Lys Ile Lys Leu Thr Ala Glu Asn Gly Arg Ile Asp Ile Lys Gln Leu
65 70 75 80

Ile Ala Lys Lys Ile Lys Leu Thr Ala Glu Ala Glu Glu Leu Lys Pro
85 90 95

Phe Phe Met Lys Glu Val Gly Ser His Phe Asp Asp Phe Val Thr Asn
100 105 110

Leu Ile Glu Lys Ser Ala Ser Leu Asp Asn Lys Ala His Ser Phe Val
115 120 125

Arg Glu Asn Val Pro Arg Val Leu Asn Ser Ala Lys Glu Lys
130 135 140



(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Glu Lys Leu Pro Ile Asn Arg Leu Ala Ala Gly Ser Lys Ala Pro Ala
1 5 10 15

Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn Ala Ile Glu
20 25 30

Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu Leu Lys Asp
35 40 45

Leu Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser Tyr Ile Arg Phe
50 55 60

Leu Gln His Ser Asn Gln Lys Leu Lys Gln Glu Asn Leu Ser Leu Arg
65 70 75 80

Thr Ala Val His Lys Ser Lys Ser Leu Lys Asp Leu Val Ser Ala Cys
85 90 95

Gly Ser Gly Gly Asn Thr Asp Val Leu Met Glu Gly Val Lys Thr Glu
100 105 110

Val Glu Asp Lys Ala Lys Pro Glu Gln Arg Pro Ser Leu His Ser Arg
115 120 125

Gly Met Leu Asp Arg Ser Arg
130 135



(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

Arg Arg His Cys Pro Leu Lys Asn Pro Thr Phe Leu Asp Tyr Val Arg
1 5 10 15

Pro Arg Ser Trp Thr Cys Arg Tyr Val Phe
20 25



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

Arg Arg Arg Ala Gly Pro Gly Gly Lys Gly Gly Ala Val Ala Glu Leu
1 5 10 15

Glu Pro Arg Pro Thr Arg Arg Glu His
20 25



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 114 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

Ala Met Leu Gly Gln Ser Thr Glu Glu Leu Arg Val Arg Leu Ala Ser
1 5 10 15

His Leu Arg Lys Leu Arg Lys Arg Leu Leu Arg Asp Ala Asp Asp Leu
20 25 30

Gln Lys Arg Leu Ala Val Tyr Gln Ala Gly Ala Arg Glu Gly Ala Glu
35 40 45

Arg Gly Leu Ser Ala Ile Arg Glu Arg Leu Gly Pro Leu Val Glu Gln
50 55 60

Gly Arg Val Arg Ala Ala Thr Val Gly Ser Leu Ala Gly Gln Pro Leu
65 70 75 80

Gln Glu Arg Ala Gln Ala Trp Gly Glu Arg Leu Arg Ala Arg Met Glu
85 90 95

Glu Met Gly Ser Arg Thr Arg Asp Arg Leu Asp Glu Val Lys Glu Gln
100 105 110

Val Ala



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

Lys Leu Pro Ile Asn Arg Leu Ala Ala Gly Ser Lys Ala Pro Ala Ser
1 5 10 15

Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn Ala Ile Glu Lys
20 25 30

Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu Leu Lys Asp Leu
35 40 45

Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser Ala Val Leu Arg Lys
50 55 60

Ala Ile Asp Tyr Ile Arg Phe Leu Gln His Ser Asn Gln Lys Leu Lys
65 70 75 80

Gln Glu Asn Leu Ser Leu Arg Thr Ala Val His Lys Ser Lys Ser Leu
85 90 95

Lys Asp Leu Val Ser Ala Cys Gly Ser Gly Gly
100 105



(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Thr Gln Gln Pro Gln Gln Asp Glu Met Pro Ser Pro Thr Phe Leu Thr
1 5 10 15

Gln Val Lys Glu Ser Leu Ser Ser Tyr Trp Glu Ser Ala Lys Thr Ala
20 25 30

Ala Gln Asn Leu Tyr Glu Lys Thr Tyr Leu
35 40



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

Ser Gln Ile Gln Gln Val Pro Val Leu Leu Gln Pro His Phe Ile Lys
1 5 10 15

Ala Asp Ser Leu Leu Leu Thr Ala Met Lys Thr Asp Gly Ala Thr Val
20 25 30

Lys Ala Ala Gly Leu Ser Pro Leu Val Ser Gly Thr Thr
35 40 45



(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

Ser Leu Leu Ser Phe Met Gln Gly Tyr Met Lys His Ala Thr Lys Thr
1 5 10 15

Ala Lys Asp Ala Leu Ser Ser Val Gln Glu Ser Gln Val Ala Gln Gln
20 25 30

Ala Arg Gly Trp Val Thr Asp Gly Phe Ser Ser Leu Lys
35 40 45



(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

Ala Pro Ala Ser Ala Gln Ser Arg Gly Glu Lys Arg Thr Ala His Asn
1 5 10 15

Ala Ile Glu Lys Arg Tyr Arg Ser Ser Ile Asn Asp Lys Ile Ile Glu
20 25 30

Leu Lys Asp Leu Val Val Gly Thr Glu Ala Lys Leu Asn Lys Ser
35 40 45



(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

Asp Tyr Trp Ser Thr Val Lys Asp Lys Phe Ser Glu Phe Trp Asp Leu
1 5 10 15

Asp Pro Glu Val Arg Pro Thr Ser Ala Val Ala Ala
20 25



(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

Glu Ile Tyr Val Ala Ala Ala Leu Arg Val Lys Thr Ser Leu Pro Arg
1 5 10 15

Ala Leu His Phe Leu Thr Arg Phe Phe Leu Ser Ser Ala Arg Gln Ala
20 25 30



(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Glu Lys Ile Pro Thr
1 5



(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

Glu Lys Leu Pro Ile
1 5 



(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

Glu Asn Gly Arg Cys Ile Gln Ala Asn Tyr Ser Leu Met Glu Asn Gly
1 5 10 15

Lys Ile Lys Val Leu Asn Gln Glu Leu Arg Ala Asp Gly
20 25



(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Ala Val Leu Arg Lys Ala Ile Asp Tyr Ile Arg Phe Leu Gln His Ser
1 5 10 15

Asn Gln Lys Leu Lys Gln Glu Asn Leu Ser Leu Arg Thr Ala Val
20 25 30



(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 

Met Lys Gln Leu Glu Asp Lys Val Glu Glu Leu Leu Ser Lys Asn Tyr
1 5 10 15

His Leu Glu Asn Glu Val Ala Arg Leu Lys Lys Leu Val Gly Glu Arg
20 25 30



(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

Arg Ile Gln Ile Gln Glu Lys Leu Gln Gln Leu Lys Arg His Ile Gln
1 5 10 15

Asn Ile Asp Ile Gln His Leu Ala Gly Lys Leu Lys Gln His Ile Glu
20 25 30



(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

Val Leu Gln Gln Val Lys Ile Lys Asp Tyr Phe Glu Lys Leu Val Gly
1 5 10 15

Phe Ile Asp Asp Ala Val Lys Lys Leu Asn Glu Leu Ser Phe Lys Thr
20 25 30

Phe Ile Glu
35



(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

Glu Leu Ser Phe Lys Thr Phe Ile Glu Asp Val Asn Lys Phe Leu Asp
1 5 10 15

Met Leu Ile Lys Lys Leu Lys Ser Phe Asp Tyr His Gln Phe Val
20 25 30



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

His Gln Phe Val Asp Glu Thr Asn Asp Lys Ile Arg Glu Val Thr Gln
1 5 10 15

Arg Leu Asn Gly Glu Ile Gln Ala Leu Glu Leu Pro
20 25



(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

Ala Ala Lys Asn Leu Thr Asp Phe Ala Glu Gln Tyr Ser Ile Gln Asp
1 5 10 15

Trp Ala Lys Arg Met Lys Ala Leu Val Glu Gln Gly Phe Thr Val
20 25 30



(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

Ser Ala Ser Leu Ala His Met Lys Ala Lys Phe Arg Glu Thr Leu Glu
1 5 10 15

Asp Thr Arg Asp Arg Met Tyr Asp Met Asp Ile Gln Gln Glu Leu Gln
20 25 30

Arg Tyr Leu
35



(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

Cys Leu Asn Leu His Lys Phe Asn Glu Phe Ile Gln Asn Glu Leu Gln
1 5 10 15

Glu Ala Ser Gln Glu Leu Gln Gln Ile His Gln Tyr Ile Met Ala Leu
20 25 30

Arg Glu Glu
35



(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

Phe Leu Ile Tyr Ile Thr Glu Leu Leu Lys Lys Leu Gln Ser Thr Thr
1 5 10 15

Val Met Asn Pro Tyr Met Lys Leu Ala Pro Gly Glu Leu Thr Ile Ile
20 25 30



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

Arg Leu Leu Asp His Arg Val Pro Glu Thr Asp Met Thr Phe Arg His
1 5 10 15

Val Gly Ser Lys Leu Ile Val Ala Met Ser Ser Trp Leu Gln
20 25 30



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

Leu Asn Phe Ser Lys Leu Glu Ile Gln Ser Gln Val Asp Ser Gln His
1 5 10 15

Val Gly His Ser Val Leu Thr Ala Lys Gly Met Ala Leu Phe
20 25 30



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

Asn Gln Asn Phe Ser Ala Gly Asn Asn Glu Asn Ile Met Glu Ala His
1 5 10 15

Val Gly Ile Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn Ile

20 25 30



(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

Met Val Val Thr Arg Ile Ala Pro Ser Pro Thr Gly Asp Pro His Val
1 5 10 15

Gly Thr Ala Tyr Ile Ala Leu Phe Asn Tyr Ala Trp Ala
20 25



(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Thr Thr Val His Thr Arg Phe Pro Pro Glu Pro Asn Gly Tyr Leu His
1 5 10 15

Ile Gly His Ala Lys Ser Ile Cys Leu Asn Phe Gly Ile Ala
20 25 30



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 

Lys Ile Lys Leu Tyr Cys Gly Val Asp Pro Thr Ala Gln Ser Leu His
1 5 10 15

Leu Gly Asn Leu Val Pro Met Val Leu Leu His Phe Tyr Val
20 25 30



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

Pro Ile Ala Leu Tyr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His
1 5 10 15

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Gly Gln
20 25 30



(2) INFORMATION FOR SEQ ID NO: 159: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

Arg Val Thr Leu Tyr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His
1 5 10 15

Ile Gly Asn Leu Ala Ala Ile Leu Thr Leu Arg Arg Phe Gln
20 25 30



(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

Arg Ile Gly Ala Tyr Val Gly Ile Asp Pro Thr Ala Pro Ser Leu His
1 5 10 15

Val Gly His Leu Leu Pro Leu Met Pro Leu Phe Trp Met Tyr
20 25 30



(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

Pro Ile Ala Leu Tyr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His
1 5 10 15

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln
20 25 30



(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

Pro Leu Lys Val Lys Leu Gly Ala Asp Pro Thr Ala Pro Asp Ile His
1 5 10 15

Leu Gly His Thr Val Val Leu Asn Lys Leu Arg Gln Phe Gln
20 25 30



(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

Val Ser Lys Gly Leu Leu Ile Phe Asp Ala Ser Ser Ser Met Gly Pro
1 5 10 15

Gln Met Ser Ala Ser Val His Leu Asp Ser Lys Lys Lys Gln His Leu
20 25 30

Phe Val Lys Glu Val Lys Ile Asp Gly Gln Phe
35 40



(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

Thr Ile Ile Thr Thr Pro Pro Leu Lys Asp Phe Ser Leu Trp Glu Lys
1 5 10 15

Thr Gly Leu Lys Glu Phe Leu Lys Thr Thr Lys Gln Ser Phe Asp Leu
20 25 30

Ser Val Lys Ala Gln Tyr Lys Lys Asn Lys His
35 40



(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

Lys Asn Arg Asn Asn Ala Leu Asp Phe Val Thr Lys Ser Tyr Asn Glu
1 5 10 15

Thr Lys Ile Lys Phe Asp Lys Tyr Lys Ala Glu Lys Ser Gln Asp Glu
20 25 30

Leu Pro Arg Thr Phe Gln Ile
35



(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

Asp Ala Leu Gln Tyr Lys Leu Glu Gly Thr Thr Arg Leu Thr Arg Lys
1 5 10 15

Arg Gly Leu Lys Leu Ala Thr Ala Leu Ser Leu Ser Asn Lys Phe Val
20 25 30

Glu Gly Ser His
35



(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

Arg Ala Phe Gly Trp Glu Ala Pro Arg Glu Tyr His Met Pro Leu Leu
1 5 10 15

Arg Asn Pro Asp Lys Thr Lys Ile Ser Lys Arg Lys Ser His Thr Ser
20 25 30

Leu Asp Trp Tyr Lys Ala Glu Gly Phe Leu
35 40



(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

Asp Asn Ile Thr Ile Pro Val His Pro Arg Gln Tyr Glu Phe Ser Arg
1 5 10 15

Leu Asn Leu Glu Tyr Thr Val Met Ser Lys Arg Lys Leu Asn Leu Leu
20 25 30

Val Thr Asp Lys His Val Glu Gly Trp Asp
35 40



(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

Lys Asn Lys Gly Leu Pro Phe Gly Ile Thr Val Pro Leu Leu Thr Thr
1 5 10 15

Ala Thr Gly Glu Lys Phe Gly Lys Ser Ala Gly Asn Ala Val Phe Ile
20 25 30

Asp Pro Ser Ile Asn Thr Ala Tyr
35 40



(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr Val Pro Leu Ile Thr
1 5 10 15

Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu Gly Gly Ala Val Trp
20 25 30

Leu Asp Pro Lys Lys Thr Ser Pro Tyr
35 40



(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

Lys Thr Lys Gly Glu Ala Arg Ala Phe Gly Leu Thr Ile Pro Leu Val
1 5 10 15

Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu Ser Gly Thr Ile
20 25 30

Trp Leu Asp Lys Glu Lys Thr Ser Pro Tyr
35 40



(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

Lys Thr Ala Leu Asp Glu Cys Val Gly Phe Thr Val Pro Leu Leu Thr
1 5 10 15

Asp Ser Ser Gly Ala Lys Phe Gly Lys Ser Ala Gly Asn Ala Ile Trp
20 25 30

Leu Asp Pro Tyr Gln Thr Ser Val Phe
35 40



(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr Val Pro Leu Ile Thr
1 5 10 15

Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu Gly Gly Ala Val Trp
20 25 30

Leu Asp Pro Lys Lys Thr Ser Pro Tyr
35 40



(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

Ser Ala Gly Lys Lys Pro Gln Val Ala Ile Thr Leu Pro Leu Leu Val
1 5 10 15

Gly Leu Asp Gly Glu Lys Lys Met Ser Lys Ser Leu Gly Asn Tyr Ile
20 25 30

Gly Val Thr Glu Ala Pro Ser Asp Met Phe
35 40



(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

Arg Val Ser Thr Ala Phe Val Tyr Thr Lys Asn Pro Asn Gly Tyr Ser
1 5 10 15

Phe Ser Ile Pro Val Lys Val Leu Ala Asp Lys Phe Ile Thr Pro Gly
20 25 30

Leu Lys Leu
35



(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

Lys Leu Gly Gln Gly Cys Phe Gly Glu Val Trp Met Gly Thr Trp Asn
1 5 10 15

Gly Thr Thr Arg Val Ala Ile Lys Thr Leu Lys Pro Gly
20 25



(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

His Ile Gly His
1



(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 

His Lys Asn Thr Ser Thr Leu Ser Cys Asp Gly Ser Leu Arg His Lys 
1 5 10 15

Phe 



(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

Arg Lys Leu Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg
1 5 10 15

Ala



(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

Arg His Ile Gln Asn Ile Asp Ile Gln His Leu Ala Gly Lys Leu Lys
1 5 10 15

Gln His



(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

Lys Lys Gly Phe Tyr Lys Lys Lys Gln Cys Arg Pro Ser Lys Gly Arg
1 5 10 15

Lys



(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:

Lys Lys Pro Leu Asp Gly Glu Tyr Phe Thr Leu Gln Ile Arg Gly Arg
1 5 10 15

Glu Arg



(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 

Lys Arg Ala Leu Pro Asn Asn Thr Ser Ser Ser Pro Gln Pro Lys Lys
1 5 10 15

Lys 



(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

Lys Lys Thr Asn Leu Phe Ser Ala Leu Ile Lys Lys Lys Lys Lys Thr
1 5 10 15

Ala 



(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 

Arg Lys Thr Leu Leu Asn Ser Leu Glu Glu Ala Lys Lys Lys Lys Glu
1 5 10 15

Asp



(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

Arg Arg Glu Leu Asp Glu Ser Leu Gln Val Ala Glu Arg Leu Thr Arg
1 5 10 15

Lys



(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

Arg Arg Ser Tyr Ala Leu Val Ser Leu Ser Phe Phe Arg Lys Leu Arg 
1 5 10 15

Leu 



(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

Arg Arg Tyr Gly Asp Glu Glu Leu His Leu Cys Val Ser Arg Lys His 
1 5 10 15

Phe



(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

Lys Arg Val Ala Lys Arg Lys Leu Ile Glu Gln Asn Arg Glu Arg Arg
1 5 10 15

Arg



(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

His Arg Ser Thr Asn Ala Gln Gly Ser His Trp Lys Gln Arg Arg Lys
1 5 10 15

Phe 



(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

Lys Arg Pro Pro Ile Ser Asp Ser Glu Glu Leu Ser Ala Lys Lys Arg
1 5 10 15

Lys 



(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

Lys Lys Gly Lys Lys Pro Lys Thr Glu Lys Glu Asp Lys Val Lys His
1 5 10 15

Ile



(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

Arg Lys Arg Met Arg Asn Arg Ile Ala Ala Ser Lys Cys Arg Lys Arg
1 5 10 15

Lys 



(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

Arg His Ile Gln Asn Ile Asp Ile Gln His Leu Ala Gly Lys Leu Lys
1 5 10 15

Gln His



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

Lys Lys Ile Thr Glu Val Ala Leu Met Gly His Leu Ser Cys Asp Thr
1 5 10 15

Lys Glu Glu Arg Lys
20



(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg Ala
1 5 10



(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

His Arg Asn Ile Gln Glu Tyr Leu Ser Ile Leu Thr Asp Pro Asp Gly
1 5 10 15

Lys Gly Lys Glu Lys
20



(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

Lys Glu Val Tyr Gly Phe Asn Pro Glu Gly Lys Ala Leu Leu Lys Lys 
1 5 10 15

Thr Lys 



(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

Lys Val Leu Val Asp His Phe Gly Tyr Thr Lys Asp Asp Lys His Glu
1 5 10 15

Asp Met



(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 



Lys Ala Gly Lys Leu Lys Phe Ile Ile Pro Ser Pro Lys Arg Pro Val
1 5 10 15

Lys Leu



(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 

Arg Gln Val Ser His Ala Lys Glu Lys Leu Thr Ala Leu Thr Lys Lys
1 5 10 15

Tyr Arg 



(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids



(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

Lys Tyr Gln Ile Arg Ile Gln Ile Gln Glu Lys Leu Gln Gln Leu Lys
1 5 10 15

Arg His



(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203: 

Lys Gly Met Ala Leu Phe Gly Glu Gly Lys Ala Glu Phe Thr Gly Arg 
1 5 10 15

His Asp Ala His 

20 



(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:

Lys Gln Ser Phe Asp Leu Ser Val Lys Ala Gln Tyr Lys Lys Asn Lys
1 5 10 15

His Arg



(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:

Lys Leu Glu Gly Thr Thr Arg Leu Thr Arg Lys Arg Gly Leu Lys
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 206:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:

Lys Leu Asp Val Thr Thr Ser Ile Gly Arg Arg Gln His Leu Arg
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:

Lys Leu Asp Phe Arg Glu Ile Gln Ile Tyr Lys Lys Leu Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

Lys Ser Pro Ala Thr Asp Leu His Leu Arg Tyr Gln Lys Asp Lys Lys
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

Lys Tyr His Trp Glu His Thr Gly Leu Thr Leu Arg Glu Val Ser Ser
1 5 10 15

Lys Leu Arg Arg
20

(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:

Lys Asp Asn Val Phe Asp Gly Leu Val Arg Val Thr Gln Lys Phe His
1 5 10 15

Met Lys Val Lys His
20



(2) INFORMATION FOR SEQ ID NO: 211:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 180 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:

Ser Ile Asn Leu Pro Phe Phe Glu Thr Leu Gln Glu Tyr Phe Glu Arg
1 5 10 15

Asn Arg Gln Thr Ile Ile Val Val Leu Glu Asn Val Gln Arg Lys Leu
20 25 30

Lys His Ile Asn Ile Asp Gln Phe Val Arg Lys Tyr Arg Ala Ala Leu
35 40 45

Gly Lys Leu Pro Gln Gln Ala Asn Asp Tyr Leu Asn Ser Phe Asn Trp
50 55 60

Glu Arg Gln Val Ser His Ala Lys Glu Lys Leu Thr Ala Leu Thr Lys
65 70 75 80

Lys Tyr Arg Ile Thr Glu Asn Asp Ile Gln Ile Ala Leu Asp Asp Ala
85 90 95

Lys Ile Asn Phe Asn Glu Lys Leu Ser Gln Leu Gln Thr Tyr Met Ile
100 105 110

Gln Phe Asp Gln Tyr Ile Lys Asp Ser Tyr Asp Leu His Asp Leu Lys
115 120 125

Ile Ala Ile Ala Asn Ile Ile Asp Glu Ile Ile Glu Lys Leu Lys Ser
130 135 140

Leu Asp Glu His Tyr His Ile Arg Val Ile Leu Val Lys Thr Ile His
145 150 155 160

Asp Leu His Leu Phe Ile Glu Asn Ile Asp Phe Asn Lys Ser Gly Ser
165 170 175

Ser Thr Ala Ser
180



(2) INFORMATION FOR SEQ ID NO: 212:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 94 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:

Pro Gln Gln Val Asn Asp Tyr Leu Ser Thr Phe Ser Trp Glu Arg Gln
1 5 10 15

Val Leu Ser Ala Lys Lys Lys His Ser Asp Phe Met Glu Asp Tyr Arg
20 25 30

Ile Thr Glu Asn Asp Val Arg Ile Ala Leu Asp Asn Ala Lys Ile Asn
35 40 45

Leu Asn Glu Lys Leu Thr Gln Leu Gln Thr Tyr Val Ile Gln Phe Asp
50 55 60

Gln Tyr Ile Lys Asp Asn Tyr Asp Leu His Asp Phe Lys Thr Ala Ile
65 70 75 80

Ala Arg Ile Ile Asp Glu Ile Ile Ala Thr Leu Lys Ile Leu
85 90



(2) INFORMATION FOR SEQ ID NO: 213:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

Lys Tyr Arg Val Ala Leu Ser Arg Leu Pro Gln Gln Ile His Asp Tyr
1 5 10 15

Leu Asn Ala Ser Asp Trp Glu Arg Gln Val Ala Gly Ala Lys Glu Lys
20 25 30

Leu Thr Ser Phe Met Glu Asn Tyr Arg Ile Thr Asp Asn Asp Val Leu
35 40 45

Ile Ala Leu Asp Ser Ala Lys Ile Asn Leu Asn Glu Lys Leu Ser Gln
50 55 60

Leu Glu Thr Tyr Ala Ile Gln Phe Asp Gln Tyr Ile Arg Asp Asn Tyr
65 70 75 80

Asp Ala Gln Asp Leu
85



(2) INFORMATION FOR SEQ ID NO: 214:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 840 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:

Leu Asn Asp Phe Gln Val Pro Asp Leu His Ile Pro Glu Phe Gln Leu
1 5 10 15

Pro His Ile Ser His Thr Ile Glu Val Pro Thr Phe Gly Lys Leu Tyr
20 25 30

Ser Ile Leu Lys Ile Gln Ser Pro Leu Phe Thr Leu Asp Ala Asn Ala
35 40 45

Asp Ile Gly Asn Gly Thr Thr Ser Ala Asn Glu Ala Gly Ile Ala Ala
50 55 60

Ser Ile Thr Ala Lys Gly Glu Ser Lys Leu Glu Val Leu Asn Phe Asp
65 70 75 80

Phe Gln Ala Asn Ala Gln Leu Ser Asn Pro Lys Ile Asn Pro Leu Ala
85 90 95

Leu Lys Glu Ser Val Lys Phe Ser Ser Lys Tyr Leu Arg Thr Glu His
100 105 110

Gly Ser Glu Met Leu Phe Phe Gly Asn Ala Ile Glu Gly Lys Ser Asn
115 120 125

Thr Val Ala Ser Leu His Thr Glu Lys Asn Thr Leu Glu Leu Ser Asn
130 135 140




Gly Val Ile Val Lys Ile Asn Asn Gln Leu Thr Leu Asp Ser Asn Thr
145 150 155 160

Lys Tyr Phe His Lys Leu Asn Ile Pro Lys Leu Asp Phe Ser Ser Gln
165 170 175

Ala Asp Leu Arg Asn Glu Ile Lys Thr Leu Leu Lys Ala Gly His Ile
180 185 190

Ala Trp Thr Ser Ser Gly Lys Gly Ser Trp Lys Trp Ala Cys Pro Arg
195 200 205

Phe Ser Asp Glu Gly Thr His Glu Ser Gln Ile Ser Phe Thr Ile Glu
210 215 220

Gly Pro Leu Thr Ser Phe Gly Leu Ser Asn Lys Ile Asn Ser Lys His
225 230 235 240

Leu Arg Val Asn Gln Asn Leu Val Tyr Glu Ser Gly Ser Leu Asn Phe
245 250 255

Ser Lys Leu Glu Ile Gln Ser Gln Val Asp Ser Gln His Val Gly His
260 265 270

Ser Val Leu Thr Ala Lys Gly Met Ala Leu Phe Gly Glu Gly Lys Ala
275 280 285

Glu Phe Thr Gly Arg His Asp Ala His Leu Asn Gly Lys Val Ile Gly
290 295 300

Thr Leu Lys Asn Ser Leu Phe Phe Ser Ala Gln Pro Phe Glu Ile Thr
305 310 315 320

Ala Ser Thr Asn Asn Glu Gly Asn Leu Lys Val Arg Phe Pro Leu Arg
325 330 335

Leu Thr Gly Lys Ile Asp Phe Leu Asn Asn Tyr Ala Leu Phe Leu Ser
340 345 350

Pro Ser Ala Gln Gln Ala Ser Trp Gln Val Ser Ala Arg Phe Asn Gln
355 360 365

Tyr Lys Tyr Asn Gln Asn Phe Ser Ala Gly Asn Asn Glu Asn Ile Met
370 375 380

Glu Ala His Val Gly Ile Asn Gly Glu Ala Asn Leu Asp Phe Leu Asn
385 390 395 400

Ile Pro Leu Thr Ile Pro Glu Met Arg Leu Pro Tyr Thr Ile Ile Thr
405 410 415

Thr Pro Pro Leu Lys Asp Phe Ser Leu Trp Glu Lys Thr Gly Leu Lys
420 425 430

Glu Phe Leu Lys Thr Thr Lys Gln Ser Phe Asp Leu Ser Val Lys Ala
435 440 445

Gln Tyr Lys Lys Asn Lys His Arg His Ser Ile Thr Asn Pro Leu Ala
450 455 460

Val Leu Cys Glu Phe Ile Ser Gln Ser Ile Lys Ser Phe Asp Arg His
465 470 475 480

Phe Glu Lys Asn Arg Asn Asn Ala Leu Asp Phe Val Thr Lys Ser Tyr
485 490 495

Asn Glu Thr Lys Ile Lys Phe Asp Lys Tyr Lys Ala Glu Lys Ser Gln
500 505 510

Asp Glu Leu Pro Arg Thr Phe Gln Ile Pro Gly Tyr Thr Val Pro Val
515 520 525

Val Asn Val Glu Val Ser Pro Phe Thr Ile Glu Met Ser Ala Phe Gly
530 535 540

Tyr Val Phe Pro Lys Ala Val Ser Met Pro Ser Phe Ser Ile Leu Gly
545 550 555 560

Ser Asp Val Arg Val Pro Ser Tyr Thr Leu Ile Leu Pro Ser Leu Glu
565 570 575

Leu Pro Val Leu His Val Pro Arg Asn Leu Lys Leu Ser Leu Pro His
580 585 590

Phe Lys Glu Leu Cys Thr Ile Ser His Ile Phe Ile Pro Ala Met Gly
595 600 605

Asn Ile Thr Tyr Asp Phe Ser Phe Lys Ser Ser Val Ile Thr Leu Asn
610 615 620

Thr Asn Ala Glu Leu Phe Asn Gln Ser Asp Ile Val Ala His Leu Leu
625 630 635 640

Ser Ser Ser Ser Ser Val Ile Asp Ala Leu Gln Tyr Lys Leu Glu Gly
645 650 655

Thr Thr Arg Leu Thr Arg Lys Arg Gly Leu Lys Leu Ala Thr Ala Leu
660 665 670

Ser Leu Ser Asn Lys Phe Val Glu Gly Ser His Asn Ser Thr Val Ser
675 680 685

Leu Thr Thr Lys Asn Met Glu Val Ser Val Ala Lys Thr Thr Lys Ala
690 695 700

Glu Ile Pro Ile Leu Arg Met Asn Phe Lys Gln Glu Leu Asn Gly Asn
705 710 715 720

Thr Lys Ser Lys Pro Thr Val Ser Ser Ser Met Glu Phe Lys Tyr Asp
725 730 735

Phe Asn Ser Ser Met Leu Tyr Ser Thr Ala Lys Gly Ala Val Asp His
740 745 750

Lys Leu Ser Leu Glu Ser Leu Thr Ser Tyr Phe Ser Ile Glu Ser Ser
755 760 765

Thr Lys Gly Asp Val Lys Gly Ser Val Leu Ser Arg Glu Tyr Ser Gly
770 775 780

Thr Ile Ala Ser Glu Ala Asn Thr Tyr Leu Asn Ser Lys Ser Thr Arg
785 790 795 800

Ser Ser Val Lys Leu Gln Gly Thr Ser Lys Ile Asp Asp Ile Trp Asn
805 810 815

Leu Glu Val Lys Glu Asn Phe Ala Gly Glu Ala Thr Leu Gln Arg Ile
820 825 830

Tyr Ser Leu Trp Glu His Ser Thr
835 840



(2) INFORMATION FOR SEQ ID NO: 215:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 774 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:

Glu Phe Gln Leu Pro Arg Leu Ser His Thr Ile Glu Ile Pro Ala Phe
1 5 10 15

Gly Arg Leu His Gly Ile Leu Lys Ile Gln Ser Pro Leu Phe Ile Leu
20 25 30

Asp Ala Asn Ala Asn Ile Gln Asn Val Thr Thr Leu Glu Asn Lys Ala
35 40 45

Glu Ile Val Ala Ser Ile Ala Ala Thr Gly Glu Ser Glu Ile Glu Ala
50 55 60

Leu Asn Phe Asp Phe Gln Ala Gln Ala Gln Phe Leu Glu Leu Asn Pro
65 70 75 80

Asn Pro Leu Ile Leu Lys Glu Ser Met Asn Phe Ser Ser Lys His Ala
85 90 95

Arg Met Glu His Glu Gly Glu Ile Leu Phe Ser Gly Lys Phe Ile Glu
100 105 110

Gly Lys Leu Asp Thr Val Ala Ser Leu Gln Thr Glu Lys Asn Met Val
115 120 125

Glu Phe Asn Asn Gly Met Ile Val Lys Ile Asn Asn Pro Ile Ile Leu
130 135 140

Asp Ser His Thr Lys Tyr Phe His Lys Leu Ser Ile Pro Arg Leu Asp
145 150 155 160

Phe Ser Ser Lys Ala Ser Phe Asn Asn Glu Ile Lys Met Leu Leu Glu
165 170 175

Ala Gly His Val Ala Trp Thr Ser Ser Gly Thr Gly Ser Trp Asn Trp
180 185 190

Ala Cys Pro Asn Phe Ser Asp Glu Gly Thr His Ser Ser Lys Ile Ser
195 200 205

Phe Thr Val Glu Gly Pro Ile Ala Phe Phe Gly Leu Ser Asn Asn Ile
210 215 220

Asn Gly Lys His Leu Arg Val Ile Gln Lys Leu Ala Tyr Glu Ser Gly
225 230 235 240

Phe Leu Asn Tyr Ser Met Leu Glu Val Glu Ser Lys Val Glu Ser Gln
245 250 255

His Val Gly Ser Ser Ile Leu Thr Gly Lys Gly Thr Val Leu Leu Arg
260 265 270

Glu Ala Lys Ala Glu Met Thr Gly Glu His Asn Ala Asp Leu Asn Gly
275 280 285

Lys Val Ile Gly Thr Leu Lys Asn Ser Leu Ser Phe Ser Ala Gln Pro
290 295 300

Phe Met Ile Thr Ala Ser Thr Asn Asn Asp Gly Asn Leu Lys Val Ser
305 310 315 320

Phe Pro Leu Lys Leu Thr Gly Lys Ile Asp Phe Leu Asn Asn Tyr Ala
325 330 335

Leu Phe Leu Ser Pro His Ala Gln Gln Ala Ser Trp Gln Val Ser Ala
340 345 350

Arg Phe Asn Gln Tyr Lys Tyr Asn Gln Asn Phe Ser Ala Ile Asn Asn
355 360 365

Glu His Asn Ile Glu Ala His Val Gly Met Asn Gly Asp Ala Asn Leu
370 375 380

Asp Phe Leu Thr Ile Pro Leu Thr Ile Pro Glu Val Lys Leu Pro Tyr
385 390 395 400

Ile Gly Leu Thr Thr Pro Leu Leu Lys Asp Phe Ser Ile Trp Glu Glu
405 410 415

Thr Gly Leu Lys Lys Gln Ser Phe Asp Leu Ser Val Lys Ala Gln Tyr
420 425 430

Lys Lys Asn Arg Asp Arg His Ser Ile Ala Ile Pro Leu Asn Gly Phe
435 440 445

Tyr Glu Phe Ile Leu Asn Asn Val Asp Ser Gly Ile Gly Lys Ile Gly
450 455 460

Lys Val Arg Asp Ser Ala Leu Asp Tyr Leu Ile Ser Ser Tyr Asn Glu
465 470 475 480

Ala Lys Asn Lys Phe Glu Asn Ser Leu Ile Gln Pro Ser Arg Thr Phe
485 490 495

Gln Lys Arg Gly Tyr Thr Ile Pro Phe Val Asn Ile Glu Val Thr Pro
500 505 510

Phe Thr Val Glu Thr Leu Ala Ser Ser His Val Ile Pro Lys Ala Ile
515 520 525

Asn Thr Pro Ser Val His Ile Leu Gly Pro Asn Val Ile Val Pro Ser
530 535 540

Tyr Arg Leu Val Leu Pro Ser Leu Glu Leu Pro Val Leu Arg Val Pro
545 550 555 560

Arg Asn Leu Leu Lys Phe Ser Leu Pro Asp Phe Lys Glu Leu Arg Thr
565 570 575

Ile Asp Asn Ile Tyr Ile Pro Ala Leu Gly Asn Phe Thr Tyr Asp Phe
580 585 590

Ser Phe Lys Ser Ser Val Ile Thr Leu Asn Thr Asn Val Gly Leu Tyr
595 600 605

Asn Arg Ser Asp Ile Val Ala His Phe Leu Ser Ser Ser Ser Phe Val
610 615 620

Thr Asp Ala Leu Gln Tyr Lys Leu Glu Gly Thr Ser Arg Leu Thr Arg
625 630 635 640

Lys Arg Gly Leu Lys Leu Ala Thr Ala Asp Ser Leu Thr Asn Lys Phe
645 650 655

Val Lys Gly Asn His Asp Ser Thr Phe Ser Leu Thr Lys Lys Asn Met
660 665 670

Glu Ala Ser Val Lys Thr Thr Ala Asn Leu His Ala Pro Ile Leu Thr
675 680 685

Met Asn Phe Lys Gln Glu Leu Asn Gly Asn Ala Lys Ser Lys Pro Ile
690 695 700

Val Ser Ser Ser Ile Glu Leu Asn Tyr Asp Phe Asn Ser Ser Lys Leu
705 710 715 720

Tyr Ser Thr Ala Lys Gly Gly Val Asp His Lys Phe Ser Leu Glu Ser
725 730 735

Leu Thr Ser Tyr Phe Ser Ile Glu Ser Ser Thr Lys Gly Asn Ile Lys
740 745 750

Gly Ser Val Leu Ser Gln Glu Tyr Ser Gly Ser Val Ala Ser Glu Ala
755 760 765

Asn Thr Tyr Leu Asn Ser
770



(2) INFORMATION FOR SEQ ID NO: 216:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 785 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

Glu Phe Gln Leu Pro His Leu Ser His Thr Ile Glu Ile Pro Ala Phe
1 5 10 15

Gly Lys Leu His Ser Ile Leu Lys Ile Gln Ser Pro Leu Phe Ile Leu
20 25 30

Asp Ala Asn Ala Asn Ile Gln Asn Val Thr Thr Ser Gly Asn Lys Ala
35 40 45

Glu Ile Val Ala Ser Val Thr Ala Lys Gly Glu Ser Gln Phe Glu Ala
50 55 60

Leu Asn Phe Asp Phe Gln Ala Gln Ala Gln Phe Leu Glu Leu Asn Pro
65 70 75 80

His Pro Pro Val Leu Lys Glu Ser Met Asn Phe Ser Ser Lys His Val
85 90 95

Arg Met Glu His Glu Gly Glu Ile Val Phe Asp Gly Lys Ala Ile Glu
100 105 110

Gly Lys Ser Asp Thr Val Ala Ser Leu His Thr Glu Lys Asn Glu Val
115 120 125

Glu Phe Asn Asn Gly Met Thr Val Lys Val Asn Asn Gln Leu Thr Leu
130 135 140

Asp Ser His Thr Lys Tyr Phe His Lys Leu Ser Val Pro Arg Leu Asp
145 150 155 160

Phe Ser Ser Lys Ala Ser Leu Asn Asn Glu Ile Lys Thr Leu Leu Glu
165 170 175

Ala Gly His Val Ala Leu Thr Ser Ser Gly Thr Gly Ser Trp Asn Trp
180 185 190

Ala Cys Pro Asn Phe Ser Asp Glu Gly Ile His Ser Ser Gln Ile Ser
195 200 205

Phe Thr Val Asp Gly Pro Ile Ala Phe Val Gly Leu Ser Asn Asn Ile
210 215 220

Asn Gly Lys His Leu Arg Val Ile Gln Lys Leu Thr Tyr Glu Ser Gly
225 230 235 240

Phe Leu Asn Tyr Ser Lys Phe Glu Val Glu Ser Lys Val Glu Ser Gln
245 250 255

His Val Gly Ser Ser Ile Leu Thr Ala Asn Gly Arg Ala Leu Leu Lys
260 265 270

Asp Ala Lys Ala Glu Met Thr Gly Glu His Asn Ala Asn Leu Asn Gly
275 280 285

Lys Val Ile Gly Thr Leu Lys Asn Ser Leu Phe Phe Ser Ala Gln Pro
290 295 300

Phe Glu Ile Thr Ala Ser Thr Asn Asn Glu Gly Asn Leu Lys Val Gly
305 310 315 320

Phe Pro Leu Lys Leu Thr Gly Lys Ile Asp Phe Leu Asn Asn Tyr Ala
325 330 335

Leu Phe Leu Ser Pro Arg Ala Gln Gln Ala Ser Trp Gln Ala Ser Thr
340 345 350

Arg Phe Asn Gln Tyr Lys Tyr Asn Gln Asn Phe Ser Ala Ile Asn Asn
355 360 365

Glu His Asn Ile Glu Ala Ser Ile Gly Met Asn Gly Asp Ala Asn Leu
370 375 380

Asp Phe Leu Asn Ile Pro Leu Thr Ile Pro Glu Ile Asn Leu Pro Tyr
385 390 395 400

Thr Glu Phe Lys Thr Pro Leu Leu Lys Asp Phe Ser Ile Trp Glu Glu
405 410 415

Thr Gly Leu Lys Glu Phe Leu Lys Thr Thr Lys Gln Ser Phe Asp Leu
420 425 430

Ser Val Lys Ala Gln Tyr Lys Lys Asn Ser Asp Lys His Ser Ile Val
435 440 445

Val Pro Leu Gly Met Phe Tyr Glu Phe Ile Leu Asn Asn Val Asn Ser
450 455 460

Trp Asp Arg Lys Phe Glu Lys Val Arg Asn Asn Ala Leu His Phe Leu
465 470 475 480

Thr Thr Ser Tyr Asn Glu Ala Lys Ile Lys Val Asp Lys Tyr Lys Thr
485 490 495

Glu Asn Ser Leu Asn Gln Pro Ser Gly Thr Phe Gln Asn His Gly Tyr
500 505 510

Thr Ile Pro Val Val Asn Ile Glu Val Ser Pro Phe Ala Val Glu Thr
515 520 525

Leu Ala Ser Arg His Val Ile Pro Thr Ala Ile Ser Thr Pro Ser Val
530 535 540

Thr Ile Pro Gly Pro Asn Ile Met Val Pro Ser Tyr Lys Leu Val Leu
545 550 555 560

Pro Pro Leu Glu Leu Pro Val Phe His Gly Pro Gly Asn Leu Phe Lys
565 570 575

Phe Phe Leu Pro Asp Phe Lys Gly Phe Asn Thr Ile Asp Asn Ile Tyr
580 585 590

Ile Pro Ala Met Gly Asn Phe Thr Tyr Asp Phe Ser Phe Lys Ser Ser
595 600 605

Val Ile Thr Leu Asn Thr Asn Ala Gly Leu Tyr Asn Gln Ser Asp Ile
610 615 620

Val Ala His Phe Leu Ser Ser Ser Ser Phe Val Thr Asp Ala Leu Gln
625 630 635 640

Tyr Lys Leu Glu Gly Thr Ser Arg Leu Met Arg Lys Arg Gly Leu Lys
645 650 655

Leu Ala Thr Ala Val Ser Leu Thr Asn Lys Phe Val Lys Gly Ser His
660 665 670

Asp Ser Thr Ile Ser Leu Thr Lys Lys Asn Met Glu Ala Ser Val Arg
675 680 685

Thr Thr Ala Asn Leu His Ala Pro Ile Phe Ser Met Asn Phe Lys Gln
690 695 700

Glu Leu Asn Gly Asn Thr Lys Ser Lys Pro Thr Val Ser Ser Ser Ile
705 710 715 720

Glu Leu Asn Tyr Asp Phe Asn Ser Ser Lys Leu His Ser Thr Ala Thr
725 730 735

Gly Gly Ile Asp His Lys Phe Ser Leu Glu Ser Leu Thr Ser Tyr Phe
740 745 750

Ser Ile Glu Ser Phe Thr Lys Gly Asn Ile Lys Ser Ser Phe Leu Ser
755 760 765

Gln Glu Tyr Ser Gly Ser Val Ala Asn Glu Ala Asn Val Tyr Leu Asn
770 775 780

Ser
785



(2) INFORMATION FOR SEQ ID NO: 217:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1056 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:

Glu Tyr Ser Gly Thr Ile Ala Ser Glu Ala Asn Thr Tyr Leu Asn Ser
1 5 10 15

Lys Ser Thr Arg Ser Ser Val Lys Leu Gln Gly Thr Ser Lys Ile Asp
20 25 30

Asp Ile Trp Asn Leu Glu Val Lys Glu Asn Phe Ala Gly Glu Ala Thr
35 40 45

Leu Gln Arg Ile Tyr Ser Leu Trp Glu His Ser Thr Lys Asn His Leu
50 55 60

Gln Leu Glu Gly Leu Phe Phe Thr Asn Gly Glu His Thr Ser Lys Ala
65 70 75 80

Thr Leu Glu Leu Ser Pro Trp Gln Met Ser Ala Leu Val Gln Val His
85 90 95

Ala Ser Gln Pro Ser Ser Phe His Asp Phe Pro Asp Leu Gly Gln Glu
100 105 110

Val Ala Leu Asn Ala Asn Thr Lys Asn Gln Lys Ile Arg Trp Lys Asn
115 120 125

Glu Val Arg Ile His Ser Gly Ser Phe Gln Ser Gln Val Glu Leu Ser
130 135 140

Asn Asp Gln Glu Lys Ala His Leu Asp Ile Ala Gly Ser Leu Glu Gly
145 150 155 160

His Leu Arg Phe Leu Lys Asn Ile Ile Leu Pro Val Tyr Asp Lys Ser
165 170 175

Leu Trp Asp Phe Leu Lys Leu Asp Val Thr Thr Ser Ile Gly Arg Arg
180 185 190

Gln His Leu Arg Val Ser Thr Ala Phe Val Tyr Thr Lys Asn Pro Asn
195 200 205

Gly Tyr Ser Phe Ser Ile Pro Val Lys Val Leu Ala Asp Lys Phe Ile
210 215 220

Thr Pro Gly Leu Lys Leu Asn Asp Leu Asn Ser Val Leu Val Met Pro
225 230 235 240

Thr Phe His Val Pro Phe Thr Asp Leu Gln Val Pro Ser Cys Lys Leu
245 250 255

Asp Phe Arg Glu Ile Gln Ile Tyr Lys Lys Leu Arg Thr Ser Ser Phe
260 265 270

Ala Leu Asn Leu Pro Thr Leu Pro Glu Val Lys Phe Pro Glu Val Asp
275 280 285

Val Leu Thr Lys Tyr Ser Gln Pro Glu Asp Ser Leu Ile Pro Phe Phe
290 295 300

Glu Ile Thr Val Pro Glu Ser Gln Leu Thr Val Ser Arg Phe Thr Leu
305 310 315 320

Pro Lys Ser Val Ser Asp Gly Ile Ala Ala Leu Asp Leu Asn Ala Val
325 330 335

Ala Asn Lys Ile Ala Asp Phe Glu Leu Pro Thr Ile Ile Val Pro Glu
340 345 350

Gln Thr Ile Glu Ile Pro Ser Ile Lys Phe Ser Val Pro Ala Gly Ile
355 360 365

Val Ile Pro Ser Phe Gln Ala Leu Thr Ala Arg Phe Glu Val Asp Ser
370 375 380

Pro Val Tyr Asn Ala Thr Trp Ser Ala Ser Leu Lys Asn Lys Ala Asp
385 390 395 400

Tyr Val Glu Thr Val Leu Asp Ser Thr Cys Ser Ser Thr Val Gln Phe
405 410 415

Leu Glu Tyr Glu Leu Asn Val Leu Gly Thr His Lys Ile Glu Asp Gly
420 425 430

Thr Leu Ala Ser Lys Thr Lys Gly Thr Leu Ala His Arg Asp Phe Ser
435 440 445

Ala Glu Tyr Glu Glu Asp Gly Lys Phe Glu Gly Leu Gln Glu Trp Glu
450 455 460

Gly Lys Ala His Leu Asn Ile Lys Ser Pro Ala Phe Thr Asp Leu His
465 470 475 480

Leu Arg Tyr Gln Lys Asp Lys Lys Gly Ile Ser Thr Ser Ala Ala Ser
485 490 495

Pro Ala Val Gly Thr Val Gly Met Asp Met Asp Glu Asp Asp Asp Phe
500 505 510

Ser Lys Trp Asn Phe Tyr Tyr Ser Pro Gln Ser Ser Pro Asp Lys Lys
515 520 525

Leu Thr Ile Phe Lys Thr Glu Leu Arg Val Arg Glu Ser Asp Glu Glu
530 535 540

Thr Gln Ile Lys Val Asn Trp Glu Glu Glu Ala Ala Ser Gly Leu Leu
545 550 555 560

Thr Ser Leu Lys Asp Asn Val Pro Lys Ala Thr Gly Val Leu Tyr Asp
565 570 575

Tyr Val Asn Lys Tyr His Trp Glu His Thr Gly Leu Thr Leu Arg Glu
580 585 590

Val Ser Ser Lys Leu Arg Arg Asn Leu Gln Asn Asn Ala Glu Trp Val
595 600 605

Tyr Gln Gly Ala Ile Arg Gln Ile Asp Asp Ile Asp Val Arg Phe Gln
610 615 620

Lys Ala Ala Ser Gly Thr Thr Gly Thr Tyr Gln Glu Trp Lys Asp Lys
625 630 635 640

Ala Gln Asn Leu Tyr Gln Glu Leu Leu Thr Gln Glu Gly Gln Ala Ser
645 650 655

Phe Gln Gly Leu Lys Asp Asn Val Phe Asp Gly Leu Val Arg Val Thr
660 665 670

Gln Lys Phe His Met Lys Val Lys His Leu Ile Asp Ser Leu Ile Asp
675 680 685

Phe Leu Asn Phe Pro Arg Phe Gln Phe Pro Gly Lys Pro Gly Ile Tyr
690 695 700

Thr Arg Glu Glu Leu Cys Thr Met Phe Ile Arg Glu Val Gly Thr Val
705 710 715 720

Leu Ser Gln Val Tyr Ser Lys Val His Asn Gly Ser Glu Ile Leu Phe
725 730 735

Ser Tyr Phe Gln Asp Leu Val Ile Thr Leu Pro Phe Glu Leu Arg Lys
740 745 750

His Lys Leu Ile Asp Val Ile Ser Met Tyr Arg Glu Leu Leu Lys Asp
755 760 765

Leu Ser Lys Glu Ala Gln Glu Val Phe Lys Ala Ile Gln Ser Leu Lys
770 775 780

Thr Thr Glu Val Leu Arg Asn Leu Gln Asp Leu Leu Gln Phe Ile Phe
785 790 795 800




Gln Leu Ile Glu Asp Asn Ile Lys Gln Leu Lys Glu Met Lys Phe Thr
805 810 815

Tyr Leu Ile Asn Tyr Ile Gln Asp Glu Ile Asn Thr Ile Phe Asn Asp
820 825 830

Tyr Ile Pro Tyr Val Phe Lys Leu Leu Lys Glu Asn Leu Cys Leu Asn
835 840 845

Leu His Lys Phe Asn Glu Phe Ile Gln Asn Glu Leu Gln Glu Ala Ser
850 855 860

Gln Glu Leu Gln Gln Ile His Gln Tyr Ile Met Ala Leu Arg Glu Glu
865 870 875 880

Tyr Phe Asp Pro Ser Ile Val Gly Trp Thr Val Lys Tyr Tyr Glu Leu
885 890 895

Glu Glu Lys Ile Val Ser Leu Ile Lys Asn Leu Leu Val Ala Leu Lys
900 905 910

Asp Phe His Ser Glu Tyr Ile Val Ser Ala Ser Asn Phe Thr Ser Gln
915 920 925

Leu Ser Ser Gln Val Glu Gln Phe Leu His Arg Asn Ile Gln Glu Tyr
930 935 940

Leu Ser Ile Leu Thr Asp Pro Asp Gly Lys Gly Lys Glu Lys Ile Ala
945 950 955 960

Glu Leu Ser Ala Thr Ala Gln Glu Ile Ile Lys Ser Gln Ala Ile Ala
965 970 975

Thr Lys Lys Ile Ile Ser Asp Tyr His Gln Gln Phe Arg Tyr Lys Leu
980 985 990

Gln Asp Phe Ser Asp Gln Leu Ser Asp Tyr Tyr Glu Lys Phe Ile Ala
995 1000 1005

Glu Ser Lys Arg Leu Ile Asp Leu Ser Ile Gln Asn Tyr His Thr Phe
1010 1015 1020

Leu Ile Tyr Ile Thr Glu Leu Leu Lys Lys Leu Gln Ser Thr Thr Val
1025 1030 1035 1040

Met Asn Pro Tyr Met Lys Leu Ala Pro Gly Glu Leu Thr Ile Ile Leu
1045 1050 1055



(2) INFORMATION FOR SEQ ID NO: 218:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 989 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:

Asn Ser Lys Gly Thr Arg Ser Ser Val Arg Leu Gln Gly Ala Ser Asn
1 5 10 15

Phe Ala Gly Ile Trp Asn Phe Glu Val Gly Glu Asn Phe Ala Gly Glu
20 25 30

Ala Thr Leu Arg Arg Ile Tyr Gly Thr Trp Glu His Asn Met Ile Asn
35 40 45

His Leu Gln Val Phe Ser Tyr Phe Asp Thr Lys Gly Lys Gln Thr Cys
50 55 60

Arg Ala Thr Leu Glu Leu Ser Pro Trp Thr Met Ser Thr Leu Leu Gln
65 70 75 80

Val His Val Ser Gln Pro Ser Pro Leu Phe Asp Leu His His Phe Asp
85 90 95

Gln Glu Val Ile Leu Lys Ala Ser Thr Lys Asn Gln Lys Val Ser Trp
100 105 110

Lys Ser Glu Val Gln Val Glu Ser Gln Val Leu Gln His Asn Ala His
115 120 125

Phe Ser Asn Asp Gln Glu Glu Val Arg Leu Asp Ile Ala Gly Ser Leu
130 135 140

Glu Gly Gln Leu Trp Asp Leu Glu Asn Phe Phe Leu Pro Ala Phe Gly
145 150 155 160

Lys Ser Leu Arg Glu Leu Leu Gln Ile Asp Gly Lys Arg Gln Tyr Leu
165 170 175

Gln Ala Ser Thr Ser Leu His Tyr Thr Lys Asn Pro Asn Gly Tyr Leu
180 185 190

Leu Ser Leu Pro Val Gln Glu Leu Thr Asp Arg Phe Ile Ile Pro Gly
195 200 205

Leu Lys Leu Asn Asp Phe Ser Gly Ile Lys Ile Tyr Lys Lys Leu Ser
210 215 220

Thr Ser Pro Phe Ala Leu Asn Leu Thr Met Leu Pro Lys Val Lys Phe
225 230 235 240

Pro Gly Val Asp Leu Leu Thr Gln Tyr Ser Lys Pro Glu Gly Ser Ser
245 250 255

Val Pro Thr Phe Glu Thr Thr Ile Pro Glu Ile Gln Leu Thr Val Ser
260 265 270

Gln Phe Thr Leu Pro Lys Ser Phe Pro Val Gly Asn Thr Val Phe Asp
275 280 285

Leu Asn Lys Leu Thr Asn Leu Ile Ala Asp Val Asp Leu Pro Ser Ile
290 295 300

Thr Leu Pro Glu Gln Thr Ile Glu Ile Pro Ser Leu Glu Phe Ser Val
305 310 315 320

Pro Ala Gly Ile Phe Ile Pro Phe Phe Gly Glu Leu Thr Ala His Val
325 330 335

Gly Met Ala Ser Pro Leu Tyr Asn Val Thr Trp Ser Thr Gly Trp Lys
340 345 350

Asn Lys Ala Asp His Val Glu Thr Phe Leu Asp Ser Thr Cys Ser Ser
355 360 365

Thr Leu Gln Phe Leu Glu Tyr Ala Leu Lys Val Val Gly Thr His Arg
370 375 380

Ile Glu Asn Asp Lys Phe Ile Tyr Lys Ile Lys Gly Thr Leu Gln His
385 390 395 400

Cys Asp Phe Asn Val Lys Tyr Asn Glu Asp Gly Ile Phe Glu Gly Leu
405 410 415

Trp Asp Leu Glu Gly Glu Ala His Leu Asp Ile Thr Ser Pro Ala Leu
420 425 430

Thr Asp Phe His Leu His Tyr Lys Glu Asp Lys Thr Ser Val Ser Ala
435 440 445

Ser Ala Ala Ser Pro Ala Ile Gly Thr Val Ser Leu Asp Ala Ser Thr
450 455 460

Asp Asp Gln Ser Val Arg Leu His Val Tyr Phe Arg Pro Gln Ser Pro
465 470 475 480

Pro Asp Asn Lys Leu Ser Ile Phe Lys Met Glu Trp Arg Asp Lys Glu
485 490 495

Ser Asp Gly Glu Thr Tyr Ile Lys Ile Asn Trp Glu Glu Glu Ala Ala
500 505 510

Phe Arg Leu Leu Asp Ser Leu Lys Ser Asn Val Pro Lys Ala Ser Glu
515 520 525

Ala Val Tyr Asp Tyr Val Lys Lys Tyr His Leu Gly His Ala Ser Ser
530 535 540

Glu Leu Arg Lys Ser Leu Gln Asn Asp Ala Glu His Ala Ile Arg Met
545 550 555 560

Val Asp Glu Met Asn Val Asn Ala Gln Arg Val Thr Arg Asp Thr Tyr
565 570 575

Gln Ser Leu Tyr Lys Lys Met Leu Ala Gln Glu Ser Gln Ser Ile Pro
580 585 590

Glu Lys Leu Lys Lys Met Val Leu Gly Ser Leu Val Arg Ile Thr Gln
595 600 605

Lys Tyr His Met Ala Val Thr Trp Leu Met Asp Ser Val Ile His Phe
610 615 620

Leu Lys Phe Asn Arg Val Gln Phe Pro Gly Asn Ala Gly Thr Tyr Thr
625 630 635 640

Val Asp Glu Leu Tyr Thr Ile Ala Met Arg Glu Thr Lys Lys Leu Leu
645 650 655

Ser Gln Leu Phe Asn Gly Leu Gly His Leu Phe Ser Tyr Val Gln Asp
660 665 670

Gln Val Glu Lys Ser Arg Val Ile Asn Asp Ile Thr Phe Lys Cys Pro
675 680 685

Phe Ser Pro Thr Pro Cys Lys Leu Lys Asp Val Leu Leu Ile Phe Arg
690 695 700

Glu Asp Leu Asn Ile Leu Ser Asn Leu Gly Gln Gln Asp Ile Asn Phe
705 710 715 720

Thr Thr Ile Leu Ser Asp Phe Gln Ser Phe Leu Glu Arg Leu Leu Asp
725 730 735

Ile Ile Glu Glu Lys Ile Glu Cys Leu Lys Asn Asn Glu Ser Thr Cys
740 745 750

Val Pro Asp His Ile Asn Met Phe Phe Lys Thr His Ile Pro Phe Ala
755 760 765

Phe Lys Ser Leu Arg Glu Asn Ile Tyr Ser Val Phe Ser Glu Phe Asn
770 775 780

Asp Phe Val Gln Ser Ile Leu Gln Glu Gly Ser Tyr Lys Leu Gln Gln
785 790 795 800

Val His Gln Tyr Met Lys Ala Phe Arg Glu Glu Tyr Phe Asp Pro Ser
805 810 815

Val Val Gly Trp Thr Val Lys Tyr Tyr Glu Ile Glu Glu Lys Met Val
820 825 830

Asp Leu Ile Lys Thr Leu Leu Ala Pro Leu Arg Asp Phe Tyr Ser Glu
835 840 845

Tyr Ser Val Thr Ala Ala Asp Phe Ala Ser Lys Met Ser Thr Gln Val
850 855 860

Glu Gln Phe Val Ser Arg Asp Ile Arg Glu Tyr Leu Ser Met Leu Ala
865 870 875 880






Asp Ile Asn Gly Lys Gly Arg Glu Lys Val Ala Glu Leu Ser Ile Val
885 890 895

Val Lys Glu Arg Ile Lys Ser Trp Ser Thr Ala Val Ala Glu Ile Thr
900 905 910

Ser Asp Tyr Leu Arg Gln Leu His Ser Lys Leu Gln Asp Phe Ser Asp
915 920 925

Gln Leu Ser Gly Tyr Tyr Glu Lys Phe Val Ala Glu Ser Thr Arg Leu
930 935 940

Ile Asp Leu Ser Ile Gln Asn Tyr His Met Phe Leu Arg Tyr Ile Ala
945 950 955 960

Glu Leu Leu Lys Lys Leu Gln Val Ala Thr Ala Asn Asn Val Ser Pro
965 970 975

Tyr Leu Arg Phe Ala Gln Gly Glu Leu Ile Ile Thr Phe
980 985



(2) INFORMATION FOR SEQ ID NO: 219:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 396 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:

Lys Asp Asn Val Phe Asp Gly Leu Val Arg Val Thr Gln Lys Phe His
1 5 10 15

Met Lys Val Lys His Leu Ile Asp Ser Leu Ile Asp Phe Leu Asn Phe
20 25 30

Pro Arg Phe Gln Phe Pro Gly Lys Pro Gly Ile Tyr Thr Arg Glu Glu
35 40 45

Leu Cys Thr Met Phe Ile Arg Glu Val Gly Thr Val Leu Ser Gln Val
50 55 60

Tyr Ser Lys Val His Asn Gly Ser Glu Ile Leu Phe Ser Tyr Phe Gln
65 70 75 80

Asp Leu Val Ile Thr Leu Pro Phe Glu Leu Arg Lys His Lys Leu Ile
85 90 95

Asp Val Ile Ser Met Tyr Arg Glu Leu Leu Lys Asp Leu Ser Lys Glu
100 105 110

Ala Gln Glu Val Phe Lys Ala Ile Gln Ser Leu Lys Thr Thr Glu Val
115 120 125

Leu Arg Asn Leu Gln Asp Leu Leu Gln Phe Ile Phe Gln Leu Ile Glu
130 135 140

Asp Asn Ile Lys Gln Leu Lys Glu Met Lys Phe Thr Tyr Leu Ile Asn
145 150 155 160

Tyr Ile Gln Asp Glu Ile Asn Thr Ile Phe Asn Asp Tyr Ile Pro Tyr
165 170 175

Val Phe Lys Leu Leu Lys Glu Asn Leu Cys Leu Asn Leu His Lys Phe
180 185 190

Asn Glu Phe Ile Gln Asn Glu Leu Gln Glu Ala Ser Gln Glu Leu Gln
195 200 205

Gln Ile His Gln Tyr Ile Met Ala Leu Arg Glu Glu Tyr Phe Asp Pro
210 215 220

Ser Ile Val Gly Trp Thr Val Lys Tyr Tyr Glu Leu Glu Glu Lys Ile
225 230 235 240

Val Ser Leu Ile Lys Asn Leu Leu Val Ala Leu Lys Asp Phe His Ser
245 250 255

Glu Tyr Ile Val Ser Ala Ser Asn Phe Thr Ser Gln Leu Ser Ser Gln
260 265 270

Val Glu Gln Phe Leu His Arg Asn Ile Gln Glu Tyr Leu Ser Ile Leu
275 280 285

Thr Asp Pro Asp Gly Lys Gly Lys Glu Lys Ile Ala Glu Leu Ser Ala
290 295 300

Thr Ala Gln Glu Ile Ile Lys Ser Gln Ala Ile Ala Thr Lys Lys Ile
305 310 315 320

Ile Ser Asp Tyr His Gln Gln Phe Arg Tyr Lys Leu Gln Asp Phe Ser
325 330 335

Asp Gln Leu Ser Asp Tyr Tyr Glu Lys Phe Ile Ala Glu Ser Lys Arg
340 345 350

Leu Ile Asp Leu Ser Ile Gln Asn Tyr His Thr Phe Leu Ile Tyr Ile
355 360 365

Thr Glu Leu Leu Lys Lys Leu Gln Ser Thr Thr Val Met Asn Pro Tyr
370 375 380

Met Lys Leu Ala Pro Gly Glu Leu Thr Ile Ile Leu
385 390 395



(2) INFORMATION FOR SEQ ID NO: 220:

(i) SEQUENCE CHARACTERISTICS:


(A) LENGTH: 433 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:

Ile Pro Gly Leu Ser Glu Lys Tyr Thr Gly Glu Glu Leu Tyr Leu Met
1 5 10 15

Thr Thr Glu Lys Ala Ala Lys Thr Ala Asp Ile Cys Leu Ser Lys Leu
20 25 30

Gln Glu Tyr Phe Asp Ala Leu Ile Ala Ala Ile Ser Glu Leu Glu Val
35 40 45

Arg Val Pro Ala Ser Glu Thr Ile Leu Arg Gly Arg Asn Val Leu Asp
50 55 60

Gln Ile Lys Glu Met Leu Lys His Leu Gln Glu Lys Ile Arg Gln Thr
65 70 75 80

Phe Val Thr Leu Gln Glu Ala Asp Phe Ala Gly Lys Leu Asn Arg Leu
85 90 95

Lys Gln Val Val Gln Lys Thr Phe Gln Lys Ala Gly Asn Met Val Arg
100 105 110

Ser Leu Gln Ser Lys Asn Phe Glu Asp Ile Lys Val Gln Met Gln Gln
115 120 125

Leu Tyr Lys Asp Ala Met Ala Ser Asp Tyr Ala His Lys Leu Arg Ser
130 135 140

Leu Ala Glu Asn Val Lys Lys Tyr Ile Ser Gln Ile Lys Asn Phe Ser
145 150 155 160

Gln Lys Thr Leu Gln Lys Leu Ser Glu Asn Leu Gln Gln Leu Val Leu
165 170 175

Tyr Ile Lys Ala Leu Arg Glu Glu Tyr Phe Asp Pro Thr Thr Leu Gly
180 185 190

Trp Ser Val Lys Tyr Tyr Glu Val Glu Asp Lys Val Leu Gly Leu Leu
195 200 205

Lys Asn Leu Met Asp Thr Leu Val Ile Trp Tyr Asn Glu Tyr Ala Lys
210 215 220

Asp Leu Ser Asp Leu Val Thr Arg Leu Thr Asp Gln Val Arg Glu Leu
225 230 235 240

Val Glu Asn Tyr Arg Gln Glu Tyr Tyr Asp Leu Ile Thr Asp Val Glu
245 250 255

Gly Lys Gly Arg Gln Lys Val Met Glu Leu Ser Ser Ala Ala Gln Glu
260 265 270

Lys Ile Arg Tyr Trp Ser Ala Val Ala Lys Arg Lys Ile Asn Glu His
275 280 285

Asn Arg Gln Val Lys Ala Lys Leu Gln Glu Ile Tyr Gly Gln Leu Ser
290 295 300

Asp Ser Gln Glu Lys Leu Ile Asn Val Ala Lys Met Leu Ile Asp Leu
305 310 315 320

Thr Val Glu Lys Tyr Ser Thr Phe Met Lys Tyr Ile Phe Glu Leu Leu
325 330 335

Arg Trp Phe Glu Gln Ala Thr Ala Asp Ser Ile Lys Pro Tyr Ile Ala
340 345 350

Val Arg Glu Gly Glu Leu Arg Ile Asp Val Pro Phe Asp Trp Glu Tyr
355 360 365

Ile Asn Gln Met Pro Gln Lys Ser Arg Glu Ala Leu Arg Asn Lys Val
370 375 380

Glu Leu Thr Arg Ala Leu Ile Gln Gln Gly Val Glu Gln Gly Thr Arg
385 390 395 400

Lys Trp Glu Glu Met Gln Ala Phe Ile Asp Glu Gln Leu Ala Thr Glu
405 410 415

Gln Leu Ser Phe Gln Gln Ile Val Glu Asn Ile Gln Lys Arg Met Lys
420 425 430

Thr



(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 180 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:

Asp Met Thr Phe Ser Lys Gln Asn Ala Leu Leu Arg Ser Glu Tyr Gln
1 5 10 15

Ala Asp Tyr Glu Ser Leu Arg Phe Phe Ser Leu Leu Ser Gly Ser Leu
20 25 30

Asn Ser His Gly Leu Glu Leu Asn Ala Asp Ile Leu Gly Thr Asp Lys
35 40 45

Ile Asn Ser Gly Ala His Lys Ala Thr Leu Arg Ile Gly Gln Asp Gly
50 55 60

Ile Ser Thr Ser Ala Thr Thr Asn Leu Lys Cys Ser Leu Leu Val Leu
65 70 75 80

Glu Asn Glu Leu Asn Ala Glu Leu Gly Leu Ser Gly Ala Ser Met Lys
85 90 95

Leu Thr Thr Asn Gly Arg Phe Arg Glu His Asn Ala Lys Phe Ser Leu
100 105 110

Asp Gly Lys Ala Ala Leu Thr Glu Leu Ser Leu Gly Ser Ala Tyr Gln
115 120 125

Ala Met Ile Leu Gly Val Asp Ser Lys Asn Ile Phe Asn Phe Lys Val
130 135 140

Ser Gln Glu Gly Leu Lys Leu Ser Asn Asp Met Met Gly Ser Tyr Ala
145 150 155 160

Glu Met Lys Phe Asp His Thr Asn Ser Leu Asn Ile Ala Gly Leu Ser
165 170 175

Leu Asp Phe Ser
180



(2) INFORMATION FOR SEQ ID NO: 222: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 142 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222: 

Asp Leu Thr Phe Ser Lys Gln Asn Ala Leu Leu Arg Ala Glu Tyr Gln
1 5 10 15

Ala Asp Tyr Lys Ser Leu Arg Phe Phe Thr Leu Leu Ser Gly Leu Leu
20 25 30

Asn Thr His Gly Leu Glu Leu Asn Ala Asp Ile Leu Gly Thr Asp Lys
35 40 45

Met Asn Thr Ala Ala His Lys Ala Thr Leu Arg Ile Gly Gln Asn Gly
50 55 60

Val Ser Thr Ser Ala Thr Thr Ser Leu Arg Tyr Ser Pro Leu Met Leu
65 70 75 80

Glu Asn Glu Leu Asn Ala Glu Leu Ala Leu Ser Gly Ala Ser Met Lys
85 90 95

Leu Ala Thr Asn Gly Arg Phe Lys Glu His Asn Ala Lys Phe Ser Leu
100 105 110

Asp Gly Lys Ala Thr Leu Thr Glu Leu Ser Leu Gly Ser Ala Tyr Gln
115 120 125

Ala Met Ile Leu Gly Ala Asp Ser Lys Asn Ile Phe Asn Phe
130 135 140



(2) INFORMATION FOR SEQ ID NO: 223:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 420 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223: 

His Ile Phe Ile Pro Ala Met Gly Asn Ile Thr Tyr Asp Phe Ser Phe
1 5 10 15

Lys Ser Ser Val Ile Thr Leu Asn Thr Asn Ala Glu Leu Phe Asn Gln
20 25 30

Ser Asp Ile Val Ala His Leu Leu Ser Ser Ser Ser Ser Val Ile Asp
35 40 45

Ala Leu Gln Tyr Lys Leu Glu Gly Thr Thr Arg Leu Thr Arg Lys Arg
50 55 60

Gly Leu Lys Leu Ala Thr Ala Leu Ser Leu Ser Asn Lys Phe Val Glu
65 70 75 80

Gly Ser His Asn Ser Thr Val Ser Leu Thr Thr Lys Asn Met Glu Val
85 90 95

Ser Val Ala Lys Thr Thr Lys Ala Glu Ile Pro Ile Leu Arg Met Asn
100 105 110

Phe Lys Gln Glu Leu Asn Gly Asn Thr Lys Ser Lys Pro Thr Val Ser
115 120 125

Ser Ser Met Glu Phe Lys Tyr Asp Phe Asn Ser Ser Met Leu Tyr Ser
130 135 140

Thr Ala Lys Gly Ala Val Asp His Lys Leu Ser Leu Glu Ser Leu Thr
145 150 155 160

Ser Tyr Phe Ser Ile Glu Ser Ser Thr Lys Gly Asp Val Lys Gly Ser
165 170 175

Val Leu Ser Arg Glu Tyr Ser Gly Thr Ile Ala Ser Glu Ala Asn Thr
180 185 190

Tyr Leu Asn Ser Lys Ser Thr Arg Ser Ser Val Lys Leu Gln Gly Thr
195 200 205

Ser Lys Ile Asp Asp Ile Trp Asn Leu Glu Val Lys Glu Asn Phe Ala
210 215 220

Gly Glu Ala Thr Leu Gln Arg Ile Tyr Ser Leu Trp Glu His Ser Thr
225 230 235 240

Lys Asn His Leu Gln Leu Glu Gly Leu Phe Phe Thr Asn Gly Glu His
245 250 255

Thr Ser Lys Ala Thr Leu Glu Leu Ser Pro Trp Gln Met Ser Ala Leu
260 265 270

Val Gln Val His Ala Ser Gln Pro Ser Ser Phe His Asp Phe Pro Asp
275 280 285

Leu Gly Gln Glu Val Ala Leu Asn Ala Asn Thr Lys Asn Gln Lys Ile
290 295 300

Arg Trp Lys Asn Glu Val Arg Ile His Ser Gly Ser Phe Gln Ser Gln
305 310 315 320

Val Glu Leu Ser Asn Asp Gln Glu Lys Ala His Leu Asp Ile Ala Gly
325 330 335

Ser Leu Glu Gly His Leu Arg Phe Leu Lys Asn Ile Ile Leu Pro Val
340 345 350

Tyr Asp Lys Ser Leu Trp Asp Phe Leu Lys Leu Asp Val Thr Thr Ser
355 360 365

Ile Gly Arg Arg Gln His Leu Arg Val Ser Thr Ala Phe Val Tyr Thr
370 375 380

Lys Asn Pro Asn Gly Tyr Ser Phe Ser Ile Pro Val Lys Val Leu Ala
385 390 395 400

Asp Lys Phe Ile Thr Pro Gly Leu Lys Leu Asn Asp Leu Asn Ser Val
405 410 415

Leu Val Met Pro
420



(2) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 275 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:

Met Ala Ser Glu Lys Gly Pro Ser Asn Lys Asp Tyr Thr Leu Arg Arg
1 5 10 15

Arg Ile Glu Pro Trp Glu Phe Glu Val Phe Phe Asp Pro Gln Glu Leu
20 25 30

Arg Lys Glu Ala Cys Leu Leu Tyr Glu Ile Lys Trp Gly Ala Ser Ser
35 40 45

Lys Thr Trp Arg Ser Ser Gly Lys Asn Thr Thr Asn His Val Glu Val
50 55 60

Asn Phe Leu Glu Lys Leu Thr Arg Lys Glu Ala Cys Leu Leu Tyr Glu
65 70 75 80

Ile Lys Trp Gly Ala Ser Ser Lys Thr Trp Arg Ser Ser Gly Lys Asn
85 90 95

Thr Thr Asn His Val Glu Val Asn Phe Leu Glu Lys Leu Thr Ser Glu
100 105 110

Gly Arg Leu Gly Pro Ser Thr Cys Cys Ser Ile Thr Trp Phe Leu Ser
115 120 125

Trp Ser Pro Cys Trp Glu Cys Ser Met Ala Ile Arg Glu Phe Leu Ser
130 135 140

Gln His Pro Gly Val Thr Leu Ile Ile Phe Val Ala Arg Leu Phe Gln
145 150 155 160

His Met Asp Arg Arg Asn Arg Gln Gly Leu Lys Asp Leu Val Thr Ser
165 170 175

Gly Val Thr Val Arg Val Met Ser Val Ser Glu Tyr Cys Tyr Cys Trp
180 185 190

Glu Asn Phe Val Asn Tyr Pro Pro Gly Lys Ala Ala Gln Trp Pro Arg
195 200 205

Tyr Pro Pro Arg Trp Met Leu Met Tyr Ala Leu Glu Leu Tyr Cys Ile
210 215 220

Ile Leu Gly Leu Pro Pro Cys Leu Lys Ile Ser Arg Arg His Gln Lys
225 230 235 240

Gln Leu Thr Phe Phe Ser Leu Thr Pro Gln Tyr Cys His Tyr Lys Met
245 250 255

Ile Pro Pro Tyr Ile Leu Leu Ala Thr Gly Leu Leu Gln Pro Ser Val
260 265 270

Pro Trp Arg
275

(2) INFORMATION FOR SEQ ID NO: 225:
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 589 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225:



GGATCTGACG GTTCACTATVA CTCCCACCGT ACACGCCTAC 60

CGCCCATTTG CGTCAATGGG GCGGAGTTGT TACGACATTT TGGAAAGTCC CGTTGATTTT 120

GGTGCCAAAA CAAACTCCAT TGACGTCAAT GGGGTGGAGA CTTGGAAATC CCCGTGAGTC 180

AAACCGCTAT CCACGCCCAT TGATGTACTG CCAAAACCGC ATCACCATGG TAATAGCGAT 240

GACTAATACG TAGATGTACT GCCAAGTAGG AAAGTCCCAT AAGGTCATGT ACTGGGCATA 300

ATGCCAGGCG GGCCATTTAC CGTCATTGAC GTCAATAGGG GGCGTACTTG GCATATGATA 360

CACTTGATGT ACTGCCAAGT GGGCAGTTTA CCGTAAATAC TCCACCCATT GACGTCAATG 420

GAAAGTCCCT ATTGGCGTTA CTATGGGAAC ATACGTCATT ATTGACGTCA ATGGGCGGGG 480

GTCGTTGGGC GGTCAGCCAG GCGGGCCATT TACCGTAAGT TATGTAACGC GGAACTCCAT 540

ATATGGGCTA TGAACTAATG ACCCCGTAAT TGATTACTAT TAATAACTA 589



(2) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226: 
GATCCAAATC ACCCACTGCA ACTCCTCCCC CTGCG 



(2) INFORMATION FOR SEQ ID NO: 227: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227: 
GATCCATCCA ATTGGGCAAT CAGGAG

(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228: 
GATCCGGTCT CCAATTGG 



(2) INFORMATION FOR SEQ ID NO: 229: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229: 
GATCCTCGGG AAAGGGAAAC CGAAACTGAA GCCG

CLAIMS:

1. A composition comprising:

(a) an isolated polypeptide comprising at least one LDL or VLDL nucleic
acid binding domain; and

(b) a nucleic acid comprising an LDL or VLDL binding sequence,
wherein said nucleic acid is bound to said polypeptide.

2. The composition of claim 1, wherein said polypeptide comprises an LDL nucleic acid
binding domain. 

3. The composition of claim 1, wherein said polypeptide comprises a VLDL nucleic acid 
binding domain. 

4. The composition of claim 1, wherein said nucleic acid comprises an expression region
operably linked to a promoter active in eukaryotic cells. 

5. The composition of claim 4, wherein said expression region encodes a polypeptide. 

6. The composition of claim 4, wherein said expression region comprises an antisense 
construct. 

7. The composition of claim 5, wherein said polypeptide is selected from the group
consisting of α-globin, β-globin, γ-globin, granulocyte macrophage-colony stimulating
factor (GM-CSF), tumor necrosis factor (TNF), IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8,
IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, β-interferon, γ-interferon, cytosine
deaminase, adenosine deaminase, β-glucuronidase, hypoxanthine guanine
phosphoribosyl transferase, galactose-1-phosphate uridyltransferase, glucocerebrosidase,
glucose-6-phosphatase, thymidine kinase, lysosomal glucosidase, growth hormone,
nerve growth factor, insulin, adrenocorticotropic hormone, parathormone,
follicle-stimulating hormone, luteinizing hormone, epidermal growth factor, thyroid
stimulating hormone, CFTR, EGFR, VEGFR, IL-2 receptor, estrogen receptor, Bax, Bak,
Bcl-Xs, Bik, Bid, Bad, Harakiri, Ad E1B, an ICE-CED3 protease, neomycin resistance,
luciferase, adenine phosphoribosyl transferase (APRT), retinoblastoma, insulin, mast
cell growth factor, p53, p16, p21, MMAC1, p73, zac1 and BRCA1.

8. The composition of claim 6, wherein said antisense construct is complementary to a 
segment of an oncogene. 

9. The composition of claim 8, wherein said oncogene is selected from the group
consisting of ras, myc, neu, raf, erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl.

10. The composition of claim 4, wherein said promoter is selected from the group consisting
of CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and
human globin γ promoter.

11. The composition of claim 1, wherein said nucleic acid binding domain is an apoB100
nucleic acid binding domain. 

12. The composition of claim 1, wherein said composition further comprises one or more
lipoproteins selected from the group consisting of apoAI, apoA-II, apoA-IV, acat, apoE,
apoC-II, apoC-III and apo-D.

13. The composition of claim 11, wherein said apoB100 is selected from the group
consisting of human, rat and baboon apoB100.

14. The composition of claim 1, wherein said polypeptide comprises at least two nucleic
acid binding domains. 

15. The composition of claim 14, wherein said nucleic acid binding domain contains a motif
selected from the group consisting of a proline pipe helix DNA binding motif, an
ISGF3γ-like DNA binding motif, a SREBP-like DNA binding motif, a coiled-coil motif
and a nucleotide (ATP)-binding motif.

16. The composition of claim 14, wherein said binding domain is selected from the group
consisting of SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:82, SEQ ID
NO:83, SEQ ID NO:85, SEQ ID NO:86, SEQ ID NO:87, SEQ ID NO:88, SEQ ID
NO:89, SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:93, SEQ ID
NO:94, SEQ ID NO:95, SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID
NO:99, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:102, SEQ ID NO:103, SEQ
ID NO:105, SEQ ID NO:106, SEQ ID NO:107, SEQ ID NO:108, SEQ ID NO:109,
SEQ ID NO:110, SEQ ID NO:111, SEQ ID NO:112, SEQ ID NO:113, SEQ ID NO:114,
SEQ ID NO:115, SEQ ID NO:144, SEQ ID NO:145, SEQ ID NO:146, SEQ ID NO:147,
SEQ ID NO:148, SEQ ID NO:149, SEQ ID NO:150, SEQ ID NO:151, SEQ ID
NO:152, SEQ ID NO:153, SEQ ID NO:154, SEQ ID NO:163, SEQ ID NO:164, SEQ
ID NO:165, SEQ ID NO:166 and SEQ ID NO:175.

17. The composition of claim 1, wherein said polypeptide further comprises at least one
nuclear localization sequence.

18. The composition of claim 17, wherein said nuclear localization sequence is from
apoB100.

19. The composition of claim 17, wherein said nuclear localization sequence is selected
from the group consisting of SEQ ID NO: 178, SEQ ID NO: 179, SEQ ID NO: 180, SEQ
ID NO: 194, SEQ ID NO: 195, SEQ ID NO: 196, SEQ ID NO: 197, SEQ ID NO: 198,
SEQ ID NO: 199, SEQ ID NO: 200, SEQ ID NO: 201, SEQ ID NO: 202, SEQ ID NO:
203, SEQ ID NO: 204, SEQ ID NO: 205, SEQ ID NO: 206, SEQ ID NO: 207, SEQ ID
NO: 208, SEQ ID NO: 209, SEQ ID NO: 210.

20. A method for expressing a polypeptide in a human cell comprising:

(a) providing a composition comprising (i) an isolated polypeptide comprising at
least one LDL or VLDL nucleic acid binding domain and (ii) a nucleic acid
comprising an expression cassette comprising a sequence encoding said
polypeptide and a promoter active in eukaryotic cells, wherein said coding
sequence is operably linked to said promoter, and wherein said nucleic acid
sequence is bound to said LDL or VLDL;

(b) contacting said composition with said cell under conditions permitting transfer
of said composition into said cell; and

(c) culturing said cell under conditions permitting the expression of said
polypeptide.

21. The method of claim 20, wherein said polypeptide is a tumor suppressor.

22. The method of claim 20, wherein said polypeptide is a cytokine. 

23. The method of claim 20, wherein said polypeptide is an enzyme. 

24. The method of claim 20, wherein said polypeptide is a hormone. 

25. The method of claim 20, wherein said polypeptide is a receptor. 

26. The method of claim 20, wherein said polypeptide is an inducer of apoptosis. 

27. The method of claim 21, wherein said tumor suppressor is selected from the group
consisting of p53, p16, p21, MMAC1, p73, zac1, BRCA1 and Rb.

28. The method of claim 22, wherein said cytokine is selected from the group consisting of
IL-2, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14,
IL-15, TNF, GMCSF, β-interferon and γ-interferon.

29. The method of claim 23, wherein said enzyme is selected from the group consisting of
cytosine deaminase, adenosine deaminase, β-glucuronidase, hypoxanthine guanine
phosphoribosyl transferase, galactose-1-phosphate uridyltransferase, glucocerebrosidase,
glucose-6-phosphatase, thymidine kinase and lysosomal glucosidase.

30. The method of claim 24, wherein said hormone is selected from the group consisting of 
growth hormone, nerve growth factor, insulin, adrenocorticotropic hormone, 
parathormone, follicle-stimulating hormone, luteinizing hormone, epidermal growth 
factor and thyroid stimulating hormone. 



31. The method of claim 25, wherein said receptor is selected from the group consisting of
CFTR, EGFR, VEGFR, IL-2 receptor and the estrogen receptor.

32. The method of claim 26, wherein said inducer of apoptosis is selected from the group
consisting of Bax, Bak, Bcl-Xs, Bik, Bid, Bad, Harakiri, Ad E1B and an ICE-CED3
protease.

33. The method of claim 20, wherein said promoter is selected from the group consisting of
CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and human
globin γ promoter.

34. The method of claim 20, wherein said nucleic acid binding domain is an apoB100
nucleic acid binding domain.

35. The method of claim 20, wherein said apoB100 is selected from the group consisting of
human, rat and baboon low density apoB100.

36. The method of claim 27, wherein said binding region is selected from the group
consisting of a proline pipe helix DNA binding motif, an ISGF3γ-like DNA binding
motif, a SREBP-like DNA binding motif, a coiled-coil motif, and a nucleotide
(ATP)-binding motif.

37. The method of claim 20, wherein said polypeptide further comprises at least one nuclear
localization sequence.

38. The method of claim 37, wherein said nuclear localization sequence is an apoB100
nuclear localization sequence.

39. The method of claim 20, wherein said polypeptide is selected from the group consisting
of α-globin, β-globin, γ-globin, neomycin resistance, luciferase, adenine phosphoribosyl
transferase (APRT), mast cell growth factor.

40. A method for providing an expression construct to a human cell comprising: 

(a) providing a composition comprising (i) an isolated polypeptide comprising at 
least one LDL or VLDL nucleic acid binding domain and (ii) an expression 
cassette comprising a nucleic acid sequence encoding an expression region and a 
promoter active in eukaryotic cells, wherein said expression region is operably 
linked to said promoter, and wherein said nucleic acid sequence is bound to said 
LDL or VLDL; 

(b) contacting said composition with said cell under conditions permitting transfer
of said composition into said cell; and

(c) culturing said cell under conditions permitting the expression of said expression
region.

41. The method of claim 40, wherein said expression construct comprises an antisense
construct.

42. The method of claim 40, wherein said antisense construct is derived from an oncogene.

43. The method of claim 42, wherein said oncogene is selected from the group consisting of
ras, myc, neu, raf, erb, src, fms, jun, trk, ret, gsp, hst, bcl and abl.

44. The method of claim 40, wherein said expression construct comprises a nucleic acid
coding for a gene. 

45. The method of claim 44, wherein said gene encodes a polypeptide.

46. The method of claim 40, wherein said promoter is selected from the group consisting of
CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and human
globin γ promoter.

47. The method of claim 40, wherein said nucleic acid binding domain is an apoB100
nucleic acid binding domain.

48. The method of claim 47, wherein said apoB100 is selected from the group consisting of
human, rat and baboon low density apoB100.

49. The method of claim 48, wherein said DNA binding region is selected from the group
consisting of a proline pipe helix DNA binding motif, an ISGF3γ-like DNA binding
motif, a SREBP-like DNA binding motif, a coiled-coil motif, and a nucleotide
(ATP)-binding motif.

50. The method of claim 40, wherein said polypeptide further comprises at least one nuclear 
localization sequence. 

51. The method of claim 50, wherein said nuclear localization sequence is an apoB100
nuclear localization sequence.

52. The method of claim 40, wherein said gene encodes a polypeptide selected from the
group consisting of α-globin, β-globin, γ-globin, green fluorescent protein, neomycin
resistance, luciferase, adenine phosphoribosyl transferase (APRT), mast cell growth
factor.

53. A method for treating a human disease comprising:

a) providing a composition comprising (i) an isolated polypeptide comprising at 
least one LDL or VLDL nucleic acid binding domain and (ii) an expression 
cassette comprising a nucleic acid sequence encoding an expression region and a 
promoter active in eukaryotic cells, wherein said expression region is operably 
linked to said promoter, and wherein said nucleic acid sequence is bound to said 
LDL or VLDL; and 

b) administering said composition to a human subject having said disease under 
conditions permitting transfer of said composition into cells of said human 
subject. 

54. The method of claim 53, wherein said disease is selected from the group consisting of
cancer, diabetes, cystic fibrosis and arteriosclerosis.

55. The method of claim 53, wherein said promoter is selected from the group consisting of
CMV IE, LTR, SV40 IE, HSV tk, β-actin, human globin α, human globin β and human
globin γ promoter.

56. The method of claim 53, wherein said nucleic acid binding domain is an apoB100
binding domain.

57. The method of claim 56, wherein said apoB100 is selected from the group consisting of
human, rat and baboon low density lipoprotein apoB100.

58. The method of claim 53, wherein said polypeptide comprises at least two nucleic acid 
binding regions. 

59. The method of claim 58, wherein said binding region is selected from the group
consisting of a proline pipe helix DNA binding motif, an ISGF3γ-like DNA binding
motif, a SREBP-like DNA binding motif, a coiled-coil motif, and a nucleotide
(ATP)-binding motif.

60. The method of claim 53, wherein said polypeptide comprises at least one nuclear
localization sequence.

61. The method of claim 60, wherein said nuclear localization sequence is an apoB100
nuclear localization sequence.



62. The method of claim 53, wherein said nucleic acid encodes a gene. 

63. The method of claim 53, wherein said expression construct comprises an antisense 
construct. 



64. A pharmaceutical composition comprising: 

(a) an isolated polypeptide comprising at least one LDL or VLDL nucleic acid 

binding domain; and 

(b) a nucleic acid comprising an LDL or VLDL binding sequence, wherein said 
nucleic acid is bound to said polypeptide; 

said pharmaceutical composition being dispersed in a suitable diluent. 

65. A method of transforming a cell comprising: 

a) providing a cell; 

b) contacting said cell with a composition comprising (i) an isolated polypeptide 
comprising at least one LDL or VLDL nucleic acid binding domain and (ii) an
expression cassette comprising a nucleic acid sequence encoding an expression region 
and a promoter active in eukaryotic cells, wherein said expression region is operably 
linked to said promoter, and wherein said nucleic acid sequence is bound to said LDL or 
VLDL; 

wherein expression of said expression region is indicative of said transformation. 



66. A method of transfecting a cell comprising the steps of: 
a) providing a cell;

b) contacting said cell with a composition comprising (i) an isolated polypeptide
comprising at least one LDL or VLDL nucleic acid binding domain and (ii) an 
expression cassette comprising a nucleic acid sequence encoding an expression region 
and a promoter active in eukaryotic cells, wherein said expression region is operably 
linked to said promoter, and wherein said nucleic acid sequence is bound to said LDL or 
VLDL; and 

wherein expression of said expression region is indicative of said transfection.

Identification of the regions of apo B-100 and the
proteins compared in Figures 2A-2D.

Reference Protein Name:                          SEQ ID NO.

Apo B-100 region B1 (aa 24-69)                   SEQ ID NO:3

r9 (aa 66-114). Cell division control
protein 25 gim|4857                              SEQ ID NO:4

Apo B-100 region B2 (aa 75-119)                  SEQ ID NO:5

r33 (aa 69-114). Abl proto-oncogene
tyrosine kinase (P150) gim|13887                 SEQ ID NO:6

Apo B-100 region B3-1 (aa 240-283)               SEQ ID NO:7

r35 (aa 799-841). 1-
Phosphatidylinositol-4,5-bisphosphate
phosphodiesterase gamma (PLC-gamma,
PLC-II) gim|18895                                SEQ ID NO:8

Apo B-100 region B3-2 (aa 240-284)               SEQ ID NO:9

r18 (aa 69-114). Lck proto-oncogene
tyrosine kinase (P56-LCK) gim|14213              SEQ ID NO:10

Apo B-100 region B4 (aa 457-518)                 SEQ ID NO:11

r52 (aa 57-109). BLK protein tyrosine
kinase (B lymphocyte kinase) (P55-BLK)
gim|13991                                        SEQ ID NO:12

Apo B-100 region B5 (aa 652-700)                 SEQ ID NO:13

r34 (aa 984-1031). Myosin IC heavy
chain gim|16466                                  SEQ ID NO:14

Apo B-100 region B6 (aa 711-756)                 SEQ ID NO:15

r25 (aa 12-61). Phosphatidylinositol
3-OH gim|18072                                   SEQ ID NO:16
FIG. 2E

Identification of the regions of apo B-100 and the
proteins compared in Figures 2A-2D.

Apo B-100 region B9-1 (aa 2497-2547)             SEQ ID NO:19

r35-2 (aa 800-850). 1-
Phosphatidylinositol-4,5-bisphosphate
phosphodiesterase gamma (PLC-gamma,
PLC-II) gim|18895                                SEQ ID NO:20

Apo B-100 region B9-2 (aa 2497-2551)             SEQ ID NO:21

r43 (aa 444-496). Nuclear fusion
protein FUS1 gim 19498                           SEQ ID NO:22

r49 (aa 86-134). Pgr Proto-oncogene
Tyrosine gim 14097                               SEQ ID NO:23

Apo B-100 region B10 (aa 3311-3355)              SEQ ID NO:24

r9-2 (aa 66-114). Cell division control
protein 25 gim|4857                              SEQ ID NO:25

Apo B-100 region B11 [aa range illegible]        SEQ ID NO:26

[illegible] Neutrophil cytosol
Factor 1 (NCF-47K) gim|16659                     SEQ ID NO:27

Apo B-100 region B12 (aa 3657-3710)              SEQ ID NO:28

[entry illegible]                                SEQ ID NO:29

Apo B-100 region B13 [aa range illegible]        SEQ ID NO:30

r3-2 (aa 163-214). Bem-1 protein gim|3905        SEQ ID NO:31

Apo B-100 region B14 (aa 4180-4222)              SEQ ID NO:32

r36 (aa 248-299). Neutrophil NADPH
oxidase factor (P67-PHOX) gim 16660              SEQ ID NO:33

Apo B-100 region B15 (aa 4179-422)               SEQ ID NO:34

r59. Cytoplasmic protein gim 16669               SEQ ID NO:35

FIG. 2F